Graph analysis is playing an increasingly important role in science and industry. Due to numerous limitations
in sharing real-world graphs, models for generating massive graphs are critical for developing better
algorithms. In this paper, we analyze the stochastic Kronecker graph model (SKG), which is the foundation
of the Graph500 supercomputer benchmark due to its favorable properties and easy parallelization.
Our goal is to provide a deeper understanding of the parameters and properties of this model so that its 
functionality as a benchmark is increased. We develop a rigorous mathematical analysis that shows this 
model cannot generate a power-law distribution or even a lognormal distribution. However, we formalize 
an enhanced version of the SKG model that uses random noise for smoothing. We prove both in theory 
and in practice that this enhancement leads to a lognormal distribution. Additionally, we provide a precise 
analysis of isolated vertices, showing that the graphs that are produced by SKG might be quite different 
than intended. For example, between 50% and 75% of the vertices in the Graph500 benchmarks will be 
isolated. Finally, we show that this model tends to produce extremely small core numbers (compared to 
most social networks and other real graphs) for common parameter choices. 

Categories and Subject Descriptors: D.2.8 [Software Engineering]: Metrics - complexity measures, performance
measures; E.1 [Data]: Data Structures - Graphs and Networks

General Terms: Algorithms, Theory

Additional Key Words and Phrases: graph models, R-MAT, Stochastic Kronecker Graphs (SKG), Graph500 

1. INTRODUCTION

The role of graph analysis is becoming increasingly important in science and industry because
of the prevalence of graphs in diverse scenarios such as social networks, the Web,
power grid networks, and even scientific collaboration studies. Massive graphs occur in a
variety of situations, and we need to design better and faster algorithms in order to study
them. However, it can be difficult to get access to informative large graphs in order to test our
algorithms. Companies like Netflix, AOL, and Facebook have vast arrays of data but cannot
share it due to legal or copyright issues^1. Moreover, graphs with billions of vertices cannot
be communicated easily due to their sheer size.

As was noted in [Chakrabarti and Faloutsos 2006], good graph models are extremely 
important for the study and algorithmics of real networks. Such a model should be fairly 
easy to implement and have few parameters, while exhibiting the common properties of real 
networks. Furthermore, models are needed to test algorithms and architectures designed for 
large graphs. But the theoretical and research benefits are also obvious: gaining insight into 
the properties and processes that create real networks. 
The stochastic Kronecker graph (SKG) [Leskovec and Faloutsos 2007; Leskovec et al.
2010], a generalization of the recursive matrix (R-MAT) model [Chakrabarti et al. 2004],
has been proposed for these purposes. It has very few parameters and can generate large
graphs quickly. Indeed, it is one of the few models that can generate graphs fully in parallel.
It has been empirically observed to have interesting real-network-like properties. We stress
that this is not just of theoretical or academic interest: this model has been chosen to
create graphs for the Graph500 supercomputer benchmark [Graph500 Steering Committee
2012].

^1 For example, Netflix opted not to pursue the Netflix Prize sequel due to concerns about lawsuits; see
http://blog.netflix.com/2010/03/this-is-neil-hunt-chief-product-officer.html

It is important to know how the parameters of this model affect various properties of the 
graphs. We stress that a mathematical analysis is important for understanding the inner
workings of a model. We quote Mitzenmacher [Mitzenmacher 2006]: "I would argue, however,
that without validating a model it is not clear that one understands the underlying behavior 
and therefore how the behavior might change over time. It is not enough to plot data and 
demonstrate a power law, allowing one to say things about current behavior; one wants 
to ensure that one can accurately predict future behavior appropriately, and that requires 
understanding the correct underlying model." 

1.1. Notation and Background 

We explain the SKG model and notation. Our goal is to generate a directed graph $G = (V, E)$
with $n = |V|$ nodes and $m = |E|$ edges. The general form of the SKG model allows for an
arbitrary square generator matrix and assumes that $n$ is a power of its size. Here, we focus
on the $2 \times 2$ case (which is equivalent to R-MAT), defining the generating matrix as

$$T = \begin{bmatrix} t_1 & t_2 \\ t_3 & t_4 \end{bmatrix} \quad \text{with } t_1 + t_2 + t_3 + t_4 = 1 \text{ and } \min_i t_i > 0.$$

We assume that $n = 2^\ell$ for some integer $\ell > 0$. For the sake of cleaner formulae, we assume
that $\ell$ is even in our analyses. Each edge is inserted according to the probabilities defined
by

$$P = \underbrace{T \otimes T \otimes \cdots \otimes T}_{\ell \text{ times}},$$

where $\otimes$ denotes the Kronecker product operation. In practice, the matrix $P$ is never formed
explicitly. Instead, each edge is inserted as follows. Divide the adjacency matrix into four
quadrants, and choose one of them with the corresponding probability $t_1$, $t_2$, $t_3$, or $t_4$. Once
a quadrant is chosen, repeat this recursively in that quadrant. Each time we iterate, we end
up in a square submatrix whose dimensions are exactly halved. After $\ell$ iterations, we reach
a single cell of the adjacency matrix, and an edge is inserted. It should be noted that here
we take a slight liberty in requiring the entries of $T$ to sum to 1. In fact, the SKG model as
defined in [Leskovec et al. 2010] works with the matrix $mP$, which is considered the matrix
of probabilities for the existence of each individual edge (though it might be more accurate
to think of it as an expected value).

Note that all edges can be inserted in parallel. This is one of the major advantages of the 
SKG model and why it is appropriate for generating large supercomputer benchmarks. 
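
The recursive quadrant descent described above can be sketched in a few lines of Python. This is a minimal illustration rather than the Graph500 reference generator: the function names `skg_edge` and `skg_edges`, the row-major ordering of the quadrants, and the seeded `random.Random` generator are assumptions made for the example. Each insertion is independent of the others, which is why the loop in `skg_edges` could be split across workers.

```python
import random


def skg_edge(T, levels, rng):
    """Return one (source, sink) pair by descending `levels` quadrants."""
    t1, t2, t3, t4 = T
    src = dst = 0
    for _ in range(levels):
        u = rng.random()
        # Quadrants in row-major order: (0,0), (0,1), (1,0), (1,1).
        if u < t1:
            sbit, dbit = 0, 0
        elif u < t1 + t2:
            sbit, dbit = 0, 1
        elif u < t1 + t2 + t3:
            sbit, dbit = 1, 0
        else:
            sbit, dbit = 1, 1
        # Each level fixes one more bit of the source and of the sink.
        src = (src << 1) | sbit
        dst = (dst << 1) | dbit
    return src, dst


def skg_edges(T, levels, m, seed=0):
    """Perform m independent insertions; repeated edges are kept (multigraph)."""
    rng = random.Random(seed)
    return [skg_edge(T, levels, rng) for _ in range(m)]
```

For example, `skg_edges((0.57, 0.19, 0.19, 0.05), levels=10, m=16 * 2**10)` returns 16,384 (source, sink) pairs over 1,024 vertices, possibly with repeats.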

For convenience, we also define some derivative parameters that will be useful in subsequent
discussions. We let $\Delta = m/n$ denote the average degree and let $\sigma = t_1 + t_2 - 0.5$
denote the skew. The parameters of the SKG model are summarized in Table I.

1.2. Our Contributions 

Our overall contribution is to provide a thorough study of the properties of SKG and show 
how the parameters affect these properties. We focus on the degree distribution, the number
of (non-)isolated nodes, the core sizes, and the trade-offs in these various goals. We give
rigorous mathematical theorems and proofs explaining the degree distribution of SKG, a 
noisy version of SKG, and the number of isolated vertices. 
Table I: Parameters for SKG models

Primary Parameters
- $T = \begin{bmatrix} t_1 & t_2 \\ t_3 & t_4 \end{bmatrix}$ = generating matrix with $t_1 + t_2 + t_3 + t_4 = 1$
- $\ell$ = number of levels (assumed even for analysis)
- $m$ = number of edges

Derivative Parameters
- $n = 2^\ell$ = number of nodes
- $\Delta = m/n$ = average degree
- $\sigma = t_1 + t_2 - 0.5$ = skew



(1) Degree distribution: We provide a rigorous mathematical analysis of the degree 
distribution of SKG. The degree distribution has often been claimed to be power-law, or
sometimes lognormal [Chakrabarti et al. 2004; Leskovec et al. 2010; Kim and Leskovec 2010]. 
Kim and Leskovec [Kim and Leskovec 2010] prove that the degree distribution has some 
lognormal characteristics. Groer et al. [Groer et al. 2011] give exact series expansions for 
the degree distribution, and express it as a mixture of normal distributions. This provides 
a qualitative explanation for the oscillatory behavior of the degree distribution (refer to 
Figure 1). Since the distribution is quite far from being truly lognormal, there has been no 
simple closed form expression that closely approximates it. We fill this gap by providing a 
complete mathematical description. We prove that SKG cannot generate a power law distri- 
bution, or even a lognormal distribution. It is most accurately characterized as fluctuating 
between a lognormal distribution and an exponential tail. We provide a simple formula that 
approximates the degree distribution. 

(2) Noisy SKG: It has been mentioned in passing [Chakrabarti et al. 2004] that adding 
noise to SKG at each level smoothens the degree distribution, but this has never been formal- 
ized or studied. We define a specific noisy version of SKG (NSKG). We prove theoretically 
and empirically that NSKG leads to a lognormal distribution. (We give some experimen- 
tal results showing a naive addition of noise does not work.) The lognormal distribution 
is important since it has been observed in real data [Bi et al. 2001; Pennock et al. 2002; 
Mitzenmacher 2003; Clauset et al. 2009]. One of the major benefits of our enhancement
is that only $\ell$ additional random numbers are needed in total. Using Graph500 parameters,
Figure 1 plots the degree distribution of a (standard) SKG and NSKG for two levels
of (maximum) noise. We can clearly see that noise dampens the oscillations, leading to a
lognormal distribution. We note that though the modification of NSKG is straightforward, 
the reason why it works is not. It involves an intricate mathematical analysis, which may 
be of theoretical interest in itself. 

(3) Isolated vertices: An isolated vertex is one that has no edges incident to it (and
hence is not really part of the output graph). We provide a formula that accurately estimates
the fraction of isolated vertices. We discover the surprising result that in the Graph500
benchmark graphs, 50-75% of the vertices are isolated; see Table II. This is a major concern for
the benchmark, since the massive graph generated has a much reduced size. Furthermore,
the average degree is now much higher than expected.

(4) Core numbers: The study of k-cores is an important tool used to study the structure
of social networks because it is a mark of the connectivity and special processes that generate
these graphs [Chakrabarti and Faloutsos 2006; Kumar et al. 2010; Alvarez-Hamelin et al.
2008; Gkantsidis et al. 2003; Goltsev et al. 2006; Carmi et al. 2007; Andersen and Chellapilla
2009]. We empirically show how the core numbers have unexpected correlations with SKG
parameters. We observed that for most of the current SKG parameters used for modeling






C. Seshadhri, A. Pinar, T. G. Kolda 



[Figure 1 image not reproduced; surviving labels: "SKG" and "Out Degree"]

Fig. 1: Comparison of degree distributions (averaged over 25 instances) for SKG and two
noisy variations, using the T from the Graph500 Benchmark parameters with $\ell = 16$.



Table II: Expected percentage of isolated vertices and repeat edges, along with average
degree of non-isolated nodes for the Graph500 benchmark. Excluding the isolated vertices
results in a much higher average degree than the value of 16 that is specified by the benchmark.

$\ell$    % Isolated Nodes    % Repeat Edges    Avg. Degree
26        51                  1.2               32
29        57                  0.7               37
32        62                  0.4               41
36        67                  0.2               49
39        71                  0.1               55
42        74                  0.1               62


real graphs, max core numbers are extremely small (much smaller than most corresponding 
real graphs). We show how modifying the matrix T affects core numbers. Most strikingly, 
we observe that changing T to increase the max core number actually leads to an increase 
in the fraction of isolated vertices. 

1.3. Influence on Graph500 benchmark

Our results have been communicated to the Graph500 steering committee, who have found 
them useful in understanding the Graph500 benchmark. The oscillations in the degree
distribution of SKG were a major concern for the committee. Our proposed NSKG model
has been implemented in the current Graph500 code^2.

Our analysis also solves the mystery of isolated vertices and how they are related to the 
SKG parameters. Members of the steering committee had observed that the number of 
isolated vertices varied greatly with the matrix T, but did not have an explanation for this. 



^2 The file generator/graph-generator.c in the most recent version as of July 2012 (2.1.4) has the implementation,
with a variable SPK_NOISE_LEVEL controlling the NSKG noise. Available at
http://www.graph500.org/sites/default/files/files/graph500-2.1.4.tar.bz2
1.4. Parameters for empirical study

Throughout the paper, we discuss a few sets of SKG parameters. The first is the Graph500
benchmark [Graph500 Steering Committee 2012]. The other two are parameters used in
[Leskovec et al. 2010] to model a co-authorship network (CAHepPh) and a web graph
(WEBNotreDame). We list these parameters here for later reference.

- Graph500: $T = [0.57, 0.19; 0.19, 0.05]$, $\ell \in \{26, 29, 32, 36, 39, 42\}$, and $m = 16 \cdot 2^\ell$.

- CAHepPh: $T = [0.42, 0.19; 0.19, 0.20]$, $\ell = 14$, and $m = 237,010$.

- WEBNotreDame^3: $T = [0.48, 0.20; 0.21, 0.11]$, $\ell = 18$, and $m = 1,497,134$.
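
For later experiments it can help to keep the three parameter sets in one place and derive $n$, the average degree, and the skew from them. The sketch below does this; the dictionary layout and the helper name `derived` are assumptions for illustration, and only the $\ell = 26$ Graph500 scale is listed.

```python
# Each entry: (T in row-major order [t1, t2; t3, t4], levels, number of edges m).
PARAMS = {
    "Graph500": ((0.57, 0.19, 0.19, 0.05), 26, 16 * 2**26),
    "CAHepPh": ((0.42, 0.19, 0.19, 0.20), 14, 237_010),
    "WEBNotreDame": ((0.48, 0.20, 0.21, 0.11), 18, 1_497_134),
}


def derived(T, levels, m):
    """Return n = 2^levels, the average degree m/n, and the skew t1 + t2 - 0.5."""
    t1, t2, t3, t4 = T
    if abs(t1 + t2 + t3 + t4 - 1.0) > 1e-9:
        raise ValueError("entries of T must sum to 1")
    n = 2**levels
    return {"n": n, "avg_degree": m / n, "skew": t1 + t2 - 0.5}


for name, (T, levels, m) in PARAMS.items():
    print(name, derived(T, levels, m))
```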

2. PREVIOUS WORK 

The R-MAT model was defined by Chakrabarti et al. [Chakrabarti et al. 2004]. The general
and more powerful SKG model was introduced by Leskovec et al. [Leskovec et al. 2005] 
and fitting algorithms were proposed by Leskovec and Faloutsos [Leskovec and Faloutsos 
2007] (combined in [Leskovec et al. 2010]). This model has generated significant interest and 
notably was chosen for the Graph500 benchmark [Graph500 Steering Committee 2012]. Kim 
and Leskovec [Kim and Leskovec 2010] defined the Multiplicative Attribute Graph (MAG) 
model, a generalization of SKG where each level may have a different matrix T. They suggest 
that certain configurations of these matrices could lead to power-law distributions. 

Since the appearance of the SKG model, there have been analyses of its properties. The 
original paper [Leskovec et al. 2010] provides some basic theorems and empirically shows
a variety of properties. Mahdian and Xu [Mahdian and Xu 2011] specifically study how 
the model parameters affect the graph properties. They show phase transition behavior 
(asymptotically) for occurrence of a large connected component and shrinking diameter. 
They also initiate a study of isolated vertices. When the SKG parameters satisfy a certain 
condition, the number of isolated vertices approaches n; however, their theorems do not 
help predict the number of isolated vertices for a given setting of SKG. In the analysis of 
the MAG model [Kim and Leskovec 2010], it is shown that the SKG degree distribution has 
some lognormal characteristics. (Lognormal distributions have been observed in real data 
[Bi et al. 2001; Pennock et al. 2002; Clauset et al. 2009]. Mitzenmacher [Mitzenmacher 2003] 
gives a survey of lognormal distributions.) 

Sala et al. [Sala et al. 2010] perform an extensive empirical study of properties of graph 
models, including SKG. Miller et al. [Miller et al. 2010] show that they can detect anomalies 
embedded in an SKG. Moreno et al. [Moreno et al. 2010] study the distributional properties 
of families of SKG. 

As noted in [Chakrabarti et al. 2004], the SKG generation procedure may give repeated 
edges. Hence, the number of edges in the graph differs slightly from the number of insertions 
(though, in practice, this is barely 1% for Graph500). Groer et al. [Groer et al. 2011] prove 
that the number of vertices of a given degree is asymptotically normally distributed, and 
provide algorithms to compute the expected number of edges in the graph (as a function of 
the number of insertions) and the expected degree distribution. 

3. DEGREE DISTRIBUTION 

In this section, we analyze the degree distribution of SKG, which is known to follow a
multinomial distribution. While an exact expression for this distribution can be written,
this is unfortunately a complicated sum of binomial coefficients. Studying the log-log plots
of the degree distribution, one sees a general heavy-tail like behavior, but there are large
oscillations. The degree distribution is not monotonically decreasing. Refer to Figure 2 for
some examples of SKG degree distributions (plotted in log-log scale). Groer et al. [Groer
et al. 2011] show that the degree distribution behaves like the sum of Gaussians, giving some
intuition for the oscillations. Recent work of Kim and Leskovec [Kim and Leskovec 2010]
provides some mathematical analysis explaining connections to a lognormal distribution. But
many questions remain. What does the distribution oscillate between? Is the distribution
bounded below by a power law? Can we approximate the distribution with a simple closed
form function? None of these questions have satisfactory answers.

^3 In [Leskovec et al. 2010], $\ell$ was 19. We make it even because, for the sake of presentation, we perform
experiments and derive formulae for even $\ell$.

Our analysis gives a precise explanation for the SKG degree distribution. We prove that 
the SKG degree distribution oscillates between a lognormal and exponential tail. We provide 
plots and experimental results to provide more intuition for our theorems.

The oscillations are a disappointing feature of SKG. Real degree distributions do not 
have large oscillations (to the contrary, they are monotonically decreasing), and more importantly,
do not have any exponential tail behavior. This is a major issue both for modeling
and benchmarking purposes since degree distribution is one of the primary characteristics 
that distinguishes real networks. 

In order to rectify the oscillations, we apply a certain model of noise and provide both 
mathematical and empirical evidence that this "straightens out" the degree distribution. 
This is discussed in §4. Indeed, small amounts of noise lead to a degree distribution that is 
predominantly lognormal. This also shows an appealing aspect of our degree distribution 
analysis. We can naturally explain how noise affects the degree distribution and give explicit
bounds on these effects.

We make a caveat here. Technically, the SKG model creates multigraphs, since there can 
be repeated edges. Our theorems and expressions will deal with degree distributions of this 
multigraph. Conventionally, this is reduced to a simple graph by removing repeated edges. 
Groer et al. [Groer et al. 2011] give detailed expressions and explanations relating the degree
distributions on the multigraph and the induced simple graph. Our empirical results show
that for a variety of parameters (including the Graph500 setting), our theorems match the
degree distribution of the underlying simple graph. Simple graphs are used in all empirical 
studies. 

3.1. Notation 

The $\ell$-bit binary representation of the vertices, numbered 0 to $n - 1$, provides a straightforward
way to partition the vertices. Specifically, each vertex has a binary representation and
therefore corresponds to an element of the boolean hypercube $\{0, 1\}^\ell$. We can partition the
vertices into slices, where each slice consists of vertices whose representations have the same
number of zeros^4. Recall that we assume $\ell$ is even. For $r \in [-\ell/2, \ell/2]$, we say that slice
$r$, denoted $S_r$, consists of all vertices whose binary representations have exactly $(\ell/2 + r)$
zeros.

These binary representations and slices are intimately connected with edge insertions in
the SKG model. For each insertion, we are trying to randomly choose a source-sink pair.
First, let us simply choose the first bit (of the representations) of the source and the sink.
Note that there are 4 possibilities (first bit for source, second for sink): 00, 01, 10, and 11.
We choose one of the combinations with probabilities $t_1$, $t_2$, $t_3$, and $t_4$ respectively. This fixes
the first bit of the source and sink. We perform this procedure again to choose the second
bit of the source and sink. Repeating $\ell$ times, we finally decide the source and sink of the
edge. Note that as $r$ becomes larger, a vertex in an $r$-slice tends to have a higher degree, since each 0 bit of the source is chosen with the larger probability $t_1 + t_2$.
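
The slice of a vertex follows directly from its bit pattern. The sketch below is an illustration with hypothetical helper names (`slice_index`, `slice_sizes`); it counts zeros in the `levels`-bit representation and also tallies how many vertices fall in each slice, which is the binomial coefficient $\binom{\ell}{\ell/2 + r}$.

```python
from collections import Counter
from math import comb


def slice_index(v, levels):
    """Slice r of vertex v: (number of zero bits among `levels` bits) - levels/2."""
    if levels % 2:
        raise ValueError("levels is assumed to be even")
    if not 0 <= v < 2**levels:
        raise ValueError("vertex id out of range")
    zeros = levels - bin(v).count("1")
    return zeros - levels // 2


def slice_sizes(levels):
    """Count the vertices in every slice by brute force (small `levels` only)."""
    return Counter(slice_index(v, levels) for v in range(2**levels))


sizes = slice_sizes(6)
for r in range(-3, 4):
    print(r, sizes[r], comb(6, 3 + r))
```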

For a real number $x$, we use $\lfloor x \rceil$ to denote the closest integer to $x$. There are certain
quantities that will be important in our analysis. These are summarized in Table III. 

Our results are fundamentally asymptotic in nature, so we explain the assumptions on 
T and the implicit assumptions of our results. We assume T to be a fixed matrix with the 
following conditions. All entries are positive and strictly less than 1. The number $t_1$ is the



^4 These are usually referred to as the levels of the boolean hypercube. In the SKG literature, levels is used
to refer to $\ell$, and hence we use a different term.
Table III: Parameters for Analysis of SKG models

General Quantities
- $\tau = (1 + 2\sigma)/(1 - 2\sigma)$
- $\lambda = \Delta (1 - 4\sigma^2)^{\ell/2}$
- $r \in \{-\ell/2, \ldots, \ell/2\}$ denotes a slice index
- $d$ denotes a degree (typically assumed $< \sqrt{n}$)
- $\deg(v)$ = outdegree of node $v$
- $S_r$ = set of nodes whose binary representations have exactly $\ell/2 + r$ zeros

Quantities Associated with Degree $d$
- $X_d$ = random variable for the number of vertices of outdegree $d$
- $\theta_d = \ln(d/\lambda)/\ln\tau$
- $\Gamma_d = \lfloor \theta_d \rceil$ (nearest integer to $\theta_d$)
- $\gamma_d = |\theta_d - \Gamma_d| \in [0, 0.5]$
- $r_d = \lfloor \theta_d \rfloor$ (only interesting for $r_d < \ell/2$)
- $\delta_d = \theta_d - r_d$



largest entry, and $\min(t_1 + t_2, t_1 + t_3) > 1/2$. This ensures that $\sigma \in (0, 1/2)$, $\tau$ is positive
and finite, and $\lambda$ is non-zero. We want to note that these conditions are satisfied by all
SKG parameters that have been used to generate realistic graph instances, to the best of
our knowledge. Indeed, when $\sigma = 1/2$, the degree distribution is Poisson.
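
The conditions on $T$ and the constants they control can be checked mechanically. The sketch below assumes the definitions $\sigma = t_1 + t_2 - 0.5$, $\tau = (1 + 2\sigma)/(1 - 2\sigma)$ and $\lambda = \Delta(1 - 4\sigma^2)^{\ell/2}$ from Tables I and III; the function name `analysis_constants` is hypothetical.

```python
def analysis_constants(T, levels, avg_degree):
    """Validate T against the stated conditions and return (sigma, tau, lambda)."""
    t1, t2, t3, t4 = T
    if not all(0.0 < t < 1.0 for t in T):
        raise ValueError("all entries must be positive and strictly less than 1")
    if t1 != max(T):
        raise ValueError("t1 must be the largest entry")
    if min(t1 + t2, t1 + t3) <= 0.5:
        raise ValueError("min(t1 + t2, t1 + t3) must exceed 1/2")
    sigma = t1 + t2 - 0.5                      # skew
    tau = (1 + 2 * sigma) / (1 - 2 * sigma)    # positive and finite for 0 < sigma < 1/2
    lam = avg_degree * (1 - 4 * sigma**2) ** (levels / 2)
    return sigma, tau, lam


print(analysis_constants((0.57, 0.19, 0.19, 0.05), 16, 16.0))
```

A uniform matrix such as `(0.25, 0.25, 0.25, 0.25)` is rejected by the last check, since it gives $t_1 + t_2 = 1/2$.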

We fix the matrix $T$ and average degree $\Delta > 1$, and think of $\ell$ as increasing. The asymptotics
hold for an increasing $\ell$. Note that since $n = 2^\ell$, this means that $n$ and $m$ are also
increasing. We use $o(1)$ as a shorthand for a quantity that is negligible as $\ell \to \infty$. Typically,
this converges to zero rapidly as $\ell$ increases. Given two quantities or expressions $A$ and $B$,
$A = (1 \pm o(1))B$ will be shorthand for $A \in [(1 - o(1))B, (1 + o(1))B]$.

As we mentioned earlier, all our results are for the SKG multigraph. For convenience, we
will just refer to this as a graph.

3.2. Explicit formula for degree distribution 

We begin by stating and explaining the main result of this section. To provide clean expressions,
we make certain approximations which are slightly off for certain regions of $d$
and $\ell$ (essentially, when $d$ is either too small or too large). Our main technical result is
Lemma 3.2, which gives a tight expression for the degree distribution. A more interpretable 
version is given first as Theorem 3.1, which is stated as an upper bound. The remainder of 
the section gives a proof for this, which can be skipped if the reader is only interested in 
the results. This theorem expresses the oscillations between the lognormal and exponential 
tail. The lower order error terms in all the following are extremely small. 

We focus on outdegrees, but these theorems hold for indegrees as well. To make dependences
clear, we remind the reader that the "free" variables are $T$, $\Delta$, $\ell$. The first two
are fixed to constants, and $\ell$ is increasing. Hence, the asymptotics are over $\ell$. All other
parameters are functions of these quantities.

We begin by giving a more digestible form of our main result, stated in Theorem 3.1. The 
more precise version is given in Lemma 3.2. A reader interested in the general message can 
skip Lemma 3.2. 

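Theorem 3.1. Assume $d \in [(e \ln 2)\ell, \sqrt{n}]$. If $\Gamma_d \ge \ell/2$, then $E[X_d]$ is negligible, i.e.,
$o(1)$; otherwise, if $\Gamma_d < \ell/2$, then (up to an additive exponential tail)

$$E[X_d] \le \frac{1}{\sqrt{d}} \exp\left(\frac{-d\gamma_d^2 \ln^2\tau}{2}\right) \binom{\ell}{\ell/2 + \Gamma_d}.$$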
[Figure 2 image not reproduced; surviving labels: horizontal axis "Out Degree", panel title "(c) Graph500 (with $\ell = 16$)"]

Fig. 2: We plot the degree distribution of graphs generated using our three different SKG parameter
sets. We then plot the respective bounds predicted by Theorem 3.1 and Lemma 3.2.
Observe how Theorem 3.1 correctly guesses the peaks and troughs of the degree distribution.
Lemma 3.2 is practically an exact match (except when the degree is below $2\ell$ or, in
Graph500, slight inaccuracies when the degree is too large).



Remark: This means that the expected outdegree distribution of a SKG is bounded above
by a function that oscillates between a lognormal and an exponential tail.

Note that $\Gamma_d = \lfloor \ln(d/\lambda)/\ln\tau \rceil = \Theta(\ln d)$. Hence $\binom{\ell}{\ell/2 + \Gamma_d}$ can be thought of as
$\binom{\ell}{\ell/2 + \Theta(\ln d)}$. The function $\binom{\ell}{\ell/2 + x}$ represents an asymptotically normal distribution of $x$,
and therefore $\binom{\ell}{\ell/2 + \Gamma_d}$ is a lognormal distribution of $d$. This lognormal term is multiplied by
$\exp(-d\gamma_d^2 \ln^2\tau/2)$. By definition, $\gamma_d \in [0, 1/2]$. When $\gamma_d$ is close to 0, then the exponential
term is almost 1. Hence the product represents a lognormal tail. On the other hand, when
$\gamma_d$ is a constant (say $> 0.2$), then the product becomes an exponential tail. Observe that $\gamma_d$
oscillates between 0 and 1/2, leading to the characteristic behavior of SKG. As $\theta_d$ becomes
closer to an integer, there are more vertices of degree $d$. As it starts to have a larger fractional
part, the number of vertices of degree $d$ is bounded above by an exponential tail. Note
that there are many values of $d$ (a constant fraction) where $\gamma_d > 0.2$. Hence, for all these $d$,
the degrees are bounded above by an exponential tail. As a result, the degree distribution
cannot be a power law or a lognormal.

The estimates provided by Theorem 3.1 for our three different SKG parameter sets are 
shown in Figure 2. Note how this simple estimate matches the oscillations of the actual 
degree distribution accurately. 
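
To see the oscillation numerically, the bound of Theorem 3.1 can be evaluated directly. The sketch below is an illustration under stated assumptions: it takes the bound to be $\frac{1}{\sqrt{d}}\exp(-d\gamma_d^2\ln^2\tau/2)\binom{\ell}{\ell/2+\Gamma_d}$ with $\theta_d$, $\Gamma_d$ and $\gamma_d$ as in Table III, returns 0 where the theorem calls the expectation negligible, and uses the hypothetical name `theorem_3_1_bound`. It is only meaningful for $d \in [(e\ln 2)\ell, \sqrt{n}]$.

```python
import math


def theorem_3_1_bound(T, levels, avg_degree, d):
    """Evaluate the assumed form of the Theorem 3.1 upper bound on E[X_d]."""
    t1, t2, _, _ = T
    sigma = t1 + t2 - 0.5
    tau = (1 + 2 * sigma) / (1 - 2 * sigma)
    lam = avg_degree * (1 - 4 * sigma**2) ** (levels / 2)
    theta = math.log(d / lam) / math.log(tau)
    big_gamma = math.floor(theta + 0.5)      # nearest integer to theta_d
    small_gamma = abs(theta - big_gamma)     # lies in [0, 0.5]
    k = levels // 2 + big_gamma
    if big_gamma >= levels / 2 or k < 0:
        # Negligible case of the theorem, or a vanishing binomial coefficient.
        return 0.0
    damping = math.exp(-d * small_gamma**2 * math.log(tau) ** 2 / 2)
    return damping * math.comb(levels, k) / math.sqrt(d)


T = (0.57, 0.19, 0.19, 0.05)   # Graph500 matrix
for d in (41, 73, 129):
    print(d, theorem_3_1_bound(T, 16, 16.0, d))
```

With the Graph500 matrix, $\ell = 16$ and $\Delta = 16$, the degrees 41 and 129 have $\theta_d$ close to an integer and give a far larger bound than a degree in between such as 73, where $\theta_d$ is close to a half-integer.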

We provide a more complex expression in Lemma 3.2 that almost completely explains the 
degree distribution. Theorem 3.1 is a direct corollary of this lemma. In the following, the 
expectation is over the random choice of the graph. 

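Lemma 3.2. For SKG, assume $d \in [(e \ln 2)\ell, \sqrt{n}]$. If $r_d \ge \ell/2$, $E[X_d]$ is negligible;
otherwise, we have

$$E[X_d] = \frac{1 \pm o(1)}{\sqrt{2\pi d}} \left\{ \exp\left(\frac{-d\delta_d^2 \ln^2\tau}{2}\right) \binom{\ell}{\ell/2 + r_d} + \exp\left(\frac{-d(1 - \delta_d)^2 \ln^2\tau}{2}\right) \binom{\ell}{\ell/2 + r_d + 1} \right\}.$$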

We plot the bound given by this lemma in Figure 2. Note how it completely captures 
the behavior of the degree distribution (barring a slight inaccuracy for larger degrees of 
the Graph500 graph because we start exceeding the upper bound for d in Lemma 3.2). 
Theorem 3.1 can be derived from this lemma, as we show below. 

Proof. (of Theorem 3.1) Since $\delta_d = \theta_d - \lfloor \theta_d \rfloor = \theta_d - r_d$, only one of $\delta_d$ and $(1 - \delta_d)$ is
at most 1/2. In the former case, $\Gamma_d = r_d$ and in the latter case, $\Gamma_d = r_d + 1$. Suppose that
$\Gamma_d = r_d$. Then,

$$\exp\left(\frac{-d(1 - \delta_d)^2 \ln^2\tau}{2}\right) \binom{\ell}{\ell/2 + r_d + 1} \le \exp\left(\frac{-d \ln^2\tau}{8}\right) \binom{\ell}{\ell/2 + r_d + 1}.$$

Note that this is a small (additive) exponential term in Lemma 3.2. So we just neglect it (and
drop the leading constant of $1/\sqrt{2\pi}$) to get a simple approximation. A similar argument
works when $\Gamma_d = r_d + 1$. □
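
The two-term expression of Lemma 3.2 is just as easy to evaluate. The following sketch assumes the lemma has the form $\frac{1}{\sqrt{2\pi d}}\{\ldots\}$ with $r_d = \lfloor\theta_d\rfloor$ and $\delta_d = \theta_d - r_d$, drops the $(1 \pm o(1))$ factor, and uses the hypothetical name `lemma_3_2_estimate`. It returns the two terms separately so that one can see which slice dominates.

```python
import math


def lemma_3_2_estimate(T, levels, avg_degree, d):
    """Return (slice r_d term, slice r_d + 1 term) of the assumed Lemma 3.2 form."""
    t1, t2, _, _ = T
    sigma = t1 + t2 - 0.5
    tau = (1 + 2 * sigma) / (1 - 2 * sigma)
    lam = avg_degree * (1 - 4 * sigma**2) ** (levels / 2)
    theta = math.log(d / lam) / math.log(tau)
    r_d = math.floor(theta)
    delta = theta - r_d
    if r_d >= levels / 2:
        return 0.0, 0.0                      # negligible case of the lemma
    scale = 1.0 / math.sqrt(2 * math.pi * d)
    log_tau_sq = math.log(tau) ** 2

    def term(frac, k):
        # The binomial coefficient vanishes outside 0..levels.
        if k < 0 or k > levels:
            return 0.0
        return scale * math.exp(-d * frac**2 * log_tau_sq / 2) * math.comb(levels, k)

    return term(delta, levels // 2 + r_d), term(1 - delta, levels // 2 + r_d + 1)


T = (0.57, 0.19, 0.19, 0.05)   # Graph500 matrix
for d in (41, 73):
    print(d, lemma_3_2_estimate(T, 16, 16.0, d))
```

When $\delta_d$ is close to 0 the first term carries essentially all of the mass, which is the situation the proof above treats by discarding the second term.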

In the next section, we prove some preliminary claims which are building blocks in the 
proof of Lemma 3.2. Then, we give a long intuitive explanation of how we prove Lemma 3.2. 
Finally, in §3.5, we give a complete proof of Lemma 3.2. 

3.3. Preliminaries 

We will state and prove some simple and known results in our own notation. This will give 
the reader some understanding about the various slices of vertices, and how the degree 
distribution is related to these slices. Our first claim computes the probability that a single 
edge insertion creates an outedge for node $v$. The probability depends only on the slice that
$v$ is in.

Claim 3.3. For vertex $v \in S_r$, the probability that a single edge insertion in SKG
produces an out-edge at node $v$ is

$$p_r = \frac{(1 - 4\sigma^2)^{\ell/2} \tau^r}{n}.$$






Proof. We consider a single edge insertion. What is the probability that this leads to an
outedge of $v$? At every level of the insertion, the edge must go into the half corresponding
to the binary representation of $v$. If the first bit of $v$ is 0, then the edge should drop in the
top half at the first level, and this happens with probability $(1/2 + \sigma)$. On the other hand, if
this bit is 1, then the edge should drop in the bottom half, which happens with probability
$(1/2 - \sigma)$. By performing this argument for every level, we get that

$$p_r = \left(\tfrac{1}{2} + \sigma\right)^{\ell/2 + r} \left(\tfrac{1}{2} - \sigma\right)^{\ell/2 - r} = \frac{(1 - 4\sigma^2)^{\ell/2}}{2^\ell} \left(\frac{1/2 + \sigma}{1/2 - \sigma}\right)^r = \frac{(1 - 4\sigma^2)^{\ell/2} \tau^r}{n}. \qquad \square$$
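
Claim 3.3 can be checked by brute force on a small instance. The sketch below, with hypothetical function names, multiplies the per-level probabilities along the bits of $v$ and compares the product with the closed form $(1 - 4\sigma^2)^{\ell/2}\tau^r/n$; it assumes the entries of $T$ sum to 1, so that $t_3 + t_4 = 1/2 - \sigma$.

```python
import math


def out_edge_probability(v, T, levels):
    """Probability that one insertion picks v as the source, level by level."""
    t1, t2, t3, t4 = T
    top, bottom = t1 + t2, t3 + t4           # 1/2 + sigma and 1/2 - sigma
    p = 1.0
    for level in reversed(range(levels)):    # most significant bit first
        bit = (v >> level) & 1
        p *= top if bit == 0 else bottom
    return p


def claim_3_3(r, T, levels):
    """Closed form (1 - 4 sigma^2)^(levels/2) * tau^r / n for slice r."""
    t1, t2, _, _ = T
    sigma = t1 + t2 - 0.5
    tau = (1 + 2 * sigma) / (1 - 2 * sigma)
    return (1 - 4 * sigma**2) ** (levels / 2) * tau**r / 2**levels


T, levels = (0.57, 0.19, 0.19, 0.05), 6
for v in (0, 5, 63):
    r = (levels - bin(v).count("1")) - levels // 2
    print(v, r, out_edge_probability(v, T, levels), claim_3_3(r, T, levels))
```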

Our next lemma bounds the probability that a vertex $v$ at slice $r$ has degree $d$. Before
that, we separately deal with slices where $p_r$ is very large. Essentially, we show that slices
where $p_r > 1/\sqrt{m}$ can be ignored. This allows for simpler calculations later on.



Claim 3.4. Let $R$ be the set $\{r \mid p_r > 1/\sqrt{m}\}$ and $U = \bigcup_{r \in R} S_r$. The probability that
any vertex in $U$ has degree less than $\sqrt{m}/2$ is at most $e^{-\Omega(\sqrt{m})}$.



Proof. Consider a fixed $v \in U$. Let $X_i$ be the indicator random variable for the $i$th
edge insertion being incident to $v$. The $X_i$s are i.i.d. with $E[X_i] > 1/\sqrt{m}$. The out-degree
of $v$ is $X = \sum_{i=1}^{m} X_i$ and $E[X] > \sqrt{m}$. By a multiplicative Chernoff bound (Theorem 4.2
of [Motwani and Raghavan 1995]), the probability that $X < \sqrt{m}/2$ is at most $e^{-\sqrt{m}/8}$.

The proof is completed by taking a union bound over all vertices in $U$ and noting that
$n e^{-\sqrt{m}/8} = e^{-\Omega(\sqrt{m})}$. □

We will set $d = o(\sqrt{n})$. Our formula becomes slightly inaccurate when $d$ becomes large,
but as our figures show, it is not a major issue in practice. The previous claim implies that
the expected number of vertices in $U$ (as defined above) with degree $d$ is vanishingly small.
Therefore, we only need to focus on slices where $p_r < 1/\sqrt{m}$.

Lemma 3.5. Let $v$ be a vertex in slice $r$. Assume that $p_r < 1/\sqrt{m}$ and $d = o(\sqrt{n})$. Then
for SKG,

$$\Pr[\deg(v) = d] = (1 + o(1)) \frac{\lambda^d}{d!} \frac{\tau^{rd}}{\exp(\lambda\tau^r)}.$$

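Proof. The probability that $v$ has outdegree $d$ is $\binom{m}{d} p_r^d (1 - p_r)^{m-d}$. Since $d = o(\sqrt{n})$,
we have $\binom{m}{d} = (1 \pm o(1)) m^d/d!$. For $x < 1/\sqrt{m}$ and $m' < m$, we can use the Taylor series
approximation, $(1 - x)^{m'} = (1 \pm o(1)) e^{-xm'}$. Using Claim 3.3, we get

$$\binom{m}{d} p_r^d (1 - p_r)^{m-d} = (1 \pm o(1)) \frac{m^d}{d!} \left(\frac{(1 - 4\sigma^2)^{\ell/2} \tau^r}{n}\right)^d \exp(-p_r (m - d))$$
$$= (1 \pm o(1)) \frac{(\lambda\tau^r)^d}{d!} \exp\left(-\Delta (1 - 4\sigma^2)^{\ell/2} \tau^r\right) \exp(d p_r).$$

Since $p_r < 1/\sqrt{m}$ and $d = o(\sqrt{n})$, $d p_r = o(1)$, completing the proof. □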
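
The quality of the approximation in Lemma 3.5 can be checked numerically for a concrete slice. The sketch below compares the exact binomial probability with $\lambda^d\tau^{rd}/(d!\exp(\lambda\tau^r))$ in log space; the choice of the Graph500 matrix with $\ell = 16$, $\Delta = 16$, $r = 2$ and $d = 10$ is ours, as are the function names.

```python
import math


def log_binomial_pmf(m, p, d):
    """log of C(m, d) p^d (1 - p)^(m - d), the exact outdegree probability."""
    log_choose = math.lgamma(m + 1) - math.lgamma(d + 1) - math.lgamma(m - d + 1)
    return log_choose + d * math.log(p) + (m - d) * math.log1p(-p)


def log_lemma_3_5(lam, tau, r, d):
    """log of lam^d tau^(r d) / (d! exp(lam tau^r))."""
    mean = lam * tau**r
    return d * math.log(mean) - math.lgamma(d + 1) - mean


# Graph500 matrix with levels = 16 and average degree 16 (n = 2^16, m = 16 n).
levels, avg_degree = 16, 16.0
sigma = 0.57 + 0.19 - 0.5
tau = (1 + 2 * sigma) / (1 - 2 * sigma)
lam = avg_degree * (1 - 4 * sigma**2) ** (levels / 2)
n = 2**levels
m = int(avg_degree * n)
r, d = 2, 10
p_r = (1 - 4 * sigma**2) ** (levels / 2) * tau**r / n   # Claim 3.3
ratio = math.exp(log_binomial_pmf(m, p_r, d) - log_lemma_3_5(lam, tau, r, d))
print("p_r =", p_r, "exact / approximation =", ratio)
```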
3.4. Understanding the degree distribution 

The following is a verbal explanation of our proof strategy and captures the essence of the 
math. 

It will be convenient to think of the parameters having some fixed values. Let $\lambda = 1$ and
$\tau = e$. (This can be achieved with a reasonable choice of $T$, $\ell$, $\Delta$.) We begin by looking at the
different slices of vertices. Vertices in a fixed $r$-slice have an identical behavior with respect
to the degree distribution. Lemma 3.5 uses elementary probability arguments to argue that
the probability that a vertex in slice $r$ has outdegree $d$ is roughly

$$\Pr[\deg(v) = d] = \frac{\exp(dr - e^r)}{d!}. \qquad (1)$$

When $r = \Omega(\ln d)$, the numerator will be less than 1, and the overall probability is $O(1/d!)$.
Therefore, those slices will not have many (or any) vertices of degree $d$. If $r = O(\ln d)$, the
numerator is $o(d!)$ and the probability is still (approximately) at most $1/d!$. Observe that
when $r$ is negative, then this probability is extremely small, even for fairly small values of
$d$. This shows that half of the vertices (in slices where the number of 1's is more than 0's)
have extremely small degrees.

It appears that the "sweet spot" is around $r \approx \ln d$. Applying Taylor approximations to
appropriate ranges of $r$, it can be shown that a suitable approximation of the probability of a
slice $r$ vertex having degree $d$ is roughly $\exp(-d(r - \ln d)^2)$. We can now show that the SKG
degree distribution is bounded above by a lognormal tail. Only the vertices in slice $r \approx \ln d$
have a good chance of having degree $d$. This means that the expected number of vertices
of degree $d$ is at most $\binom{\ell}{\ell/2 + \ln d}$. Since the latter is asymptotically normally distributed as
a function of $\ln d$, it (approximately) represents a lognormal tail. A similar conclusion was
drawn in [Kim and Leskovec 2010], though their approach and presentation are very different
from ours.

This is where we significantly diverge. The crucial observation is that $r$ is a discrete variable, not a continuous one. When $|r - \ln d| \ge 1/3$ (say), the probability of having degree $d$ is at most $\exp(-d/9)$. That is an exponential tail, so we can safely assume that those slices have no vertices of degree $d$. Refer to Figure 3. Since $\ln d$ is not necessarily integral, it could be that for all values of $r$, $|r - \ln d| \ge 1/3$. In that case, there are (essentially) no vertices of degree $d$. For concreteness, suppose $\ln d = 100/3$. Then, regardless of the value of $r$, $|r - \ln d| \ge 1/3$, and we can immediately bound the fraction of vertices that have this degree by the exponential tail, $\exp(-d/9)$. When $\ln d$ is close to being integral, then for $r = \lfloor \ln d \rceil$ (the integer nearest to $\ln d$), the $r$-slice (and only this slice) will contain many vertices of degree $d$. The quantity $|\ln d - \lfloor \ln d \rceil|$ fluctuates between 0 and 1/2, leading to the oscillations in the degree distribution.

Let $\Gamma_d = \lfloor \ln d \rceil$ and $\gamma_d = |\Gamma_d - \ln d|$. Putting the arguments above together, we can get a very good estimate of the number of vertices of degree $d$. This quantity is essentially $\exp(-\gamma_d^2 d)\binom{\ell}{\ell/2 + \Gamma_d}$, as stated in Theorem 3.1. A more nuanced argument leads to the bound in Lemma 3.2.
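
The oscillation described above can be made concrete with a small numerical sketch. It assumes the simplified setting of this subsection ($\lambda = 1$, $\tau = e$) and evaluates only the damping factor $\exp(-\gamma_d^2 d)$; the function name `damping` is ours and purely illustrative.

```python
import math

def damping(d: int) -> float:
    """Damping factor exp(-gamma_d^2 * d), assuming lambda = 1 and tau = e."""
    theta = math.log(d)
    # gamma_d: distance from ln d to the nearest integer (the nearest slice)
    gamma = abs(theta - round(theta))
    return math.exp(-gamma * gamma * d)

# Degrees whose logarithm is nearly integral keep a factor close to 1,
# while degrees "between" two slices are suppressed exponentially.
for d in (20, 33, 55, 90):
    print(d, round(math.log(d), 3), damping(d))
```

Under these assumptions, degrees such as 20 and 55 (where $\ln d$ is close to an integer) are barely damped, while degrees such as 33 and 90 (where $\ln d$ is close to a half-integer) are suppressed by several orders of magnitude.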



3.5. Proof of Lemma 3.2 

We break up the main argument into various claims. The first claim gives an expression 
for the expected number of vertices of degree d. This sum will appear to be a somewhat 
complicated sum of binomial coefficients. But, as we later show, we can deduce that most 
terms in this sum are actually negligible. 



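Claim 3.6. Define $g(r) = r\ln\tau - \ln(d/\lambda)$. Then, for SKG,

$$E[X_d] = \frac{1 \pm o(1)}{\sqrt{2\pi d}} \sum_{r=-\ell/2}^{\ell/2} \exp\left[d\left(1 + g(r) - e^{g(r)}\right)\right] \binom{\ell}{\ell/2 + r}.$$
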
Fig. 3: Probability of nodes of degree $d$ for various slices. The probability that a vertex of slice $r$ has degree $d$ follows a Gaussian curve with a peak at $\ln d$. The standard deviation is extremely small. Hence, if $\ln d$ is far from integral, no slice will have vertices of degree $d$.



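Proof. Using Lemma 3.5 and linearity of expectation, we can derive a formula for $E[X_d]$. We then apply Stirling's approximation and the fact that $|S_r| = \binom{\ell}{\ell/2 + r}$.

$$E[X_d] = (1 \pm o(1)) \sum_{r=-\ell/2}^{\ell/2} \frac{\lambda^d}{d!} \cdot \frac{(\tau^r)^d}{\exp(\lambda\tau^r)} \binom{\ell}{\ell/2 + r} = \frac{1 \pm o(1)}{\sqrt{2\pi d}} \left(\frac{e\lambda}{d}\right)^d \sum_{r=-\ell/2}^{\ell/2} \frac{(\tau^r)^d}{\exp(\lambda\tau^r)} \binom{\ell}{\ell/2 + r}.$$

Let us now focus on the quantity

$$\left(\frac{e\lambda}{d}\right)^d \frac{(\tau^r)^d}{\exp(\lambda\tau^r)} = \exp(d + d\ln\lambda + rd\ln\tau - d\ln d - \lambda\tau^r).$$

The term inside the exponent can be written as $d + d(r\ln\tau - \ln d + \ln\lambda) - d(d/\lambda)^{-1}\tau^r$. This is $d(1 + g(r) - e^{g(r)})$. Hence

$$E[X_d] = \frac{1 \pm o(1)}{\sqrt{2\pi d}} \sum_{r=-\ell/2}^{\ell/2} \exp\left[d\left(1 + g(r) - e^{g(r)}\right)\right] \binom{\ell}{\ell/2 + r}.$$

□
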



The key observation is that among the $\ell$ terms in the summation of Claim 3.6, only a few are the main contributors. All other terms sum up to a negligible quantity. We deal with this part in the following claim. We crucially use the assumption that $d \ge (e\ln 2)\ell$. This ensures that the large slices (when $|r|$ is small) do not contribute vertices of degree $d$.





Claim 3.7. Let $R$ be the set of $r$ such that $|g(r)| \ge 1$. Then, for SKG,

$$\sum_{r \in R} \exp\left[d\left(1 + g(r) - e^{g(r)}\right)\right] \binom{\ell}{\ell/2 + r} \le 1.$$

Proof. For convenience, define $h(r) = 1 + g(r) - e^{g(r)}$. We will show (shortly) that when $|g(r)| \ge 1$, $h(r) \le -1/e$. We assume $d \ge (e\ln 2)\ell$, thus $\exp(d \cdot h(r)) \le 2^{-\ell}$. Let $R$ be the set of all $r$ such that $|g(r)| \ge 1$. We can easily bound the contribution of the indices in $R$ to our total sum as

$$\sum_{r \in R} \exp(d \cdot h(r)) \binom{\ell}{\ell/2 + r} \le 2^{-\ell} \sum_{r \in R} \binom{\ell}{\ell/2 + r} \le 1.$$

It remains to prove the bound on $h(r)$. Set $\hat{h}(x) = 1 + x - e^x$, so $h(r) = \hat{h}(g(r))$. We have two cases.

- $g(r) \ge 1$: Since $\hat{h}(x)$ is decreasing when $x \ge 1$, $h(r) \le \hat{h}(1) = -(e - 2) \le -1/e$.

- $g(r) \le -1$: Since $\hat{h}(x)$ is increasing for $x \le -1$, $h(r) \le \hat{h}(-1) = -1/e$. □
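
The two-case bound in this proof is easy to sanity-check numerically. The sketch below is only an illustration under our own naming (`h_hat`, `check_claim_3_7_bound` are hypothetical helpers): it evaluates $\hat{h}(x) = 1 + x - e^x$ on a grid with $|x| \ge 1$ and confirms both $\hat{h}(x) \le -1/e$ and, for the smallest integer $d \ge (e \ln 2)\ell$, that $d\,\hat{h}(x) \le -\ell \ln 2$ (the comparison is done in the log domain to avoid underflow).

```python
import math

def h_hat(x: float) -> float:
    """h_hat(x) = 1 + x - e^x, the exponent (per unit of d) in Claim 3.7."""
    return 1.0 + x - math.exp(x)

def check_claim_3_7_bound(ell: int) -> bool:
    """Check h_hat(x) <= -1/e and d*h_hat(x) <= -ell*ln 2 on a grid with |x| >= 1."""
    xs = [1.0 + 0.01 * k for k in range(500)]
    xs += [-x for x in xs]
    bound = -1.0 / math.e
    ok_h = all(h_hat(x) <= bound + 1e-12 for x in xs)
    # Smallest integer degree satisfying the assumption d >= (e ln 2) * ell.
    d = math.ceil(math.e * math.log(2) * ell)
    ok_exp = all(d * h_hat(x) <= -ell * math.log(2) + 1e-9 for x in xs)
    return ok_h and ok_exp

print(check_claim_3_7_bound(26))
```

A grid check of this kind is of course not a proof; it merely guards against sign or constant slips when reading the two cases.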

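Now for the main technical part. The following claim, together with the previous ones, completes the proof of Lemma 3.2.

Claim 3.8. Define $R$ as in Claim 3.7, and recall that $\theta_d = \ln(d/\lambda)/\ln\tau$, $r_d = \lfloor\theta_d\rfloor$, and $\delta_d = \theta_d - r_d$. Then, for SKG,

$$\sum_{r \notin R} \exp\left[d\left(1 + g(r) - e^{g(r)}\right)\right] \binom{\ell}{\ell/2 + r} = (1 \pm o(1)) \left[ \exp\left(-\frac{d\,\delta_d^2 \ln^2\tau}{2}\right) \binom{\ell}{\ell/2 + r_d} + \exp\left(-\frac{d(1-\delta_d)^2 \ln^2\tau}{2}\right) \binom{\ell}{\ell/2 + r_d + 1} \right].$$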

Proof. Since $|g(r)| < 1$, we can perform an important approximation. Using the expansion $e^x = 1 + x + x^2/2 + \Theta(x^3)$ for $x \in (0, 1)$, we bound

$$h(r) = 1 + g(r) - e^{g(r)} = -g(r)^2/2 + \Theta(g(r)^3).$$

We request the reader to pause and consider the ramifications of this approximation. The coefficient multiplying the binomial coefficients in the sum is $\exp(-d\,g(r)^2/2)$, which is a Gaussian function of $g(r)$. This is what creates the Gaussian-like behavior of the probability of vertices of degree $d$ among the various slices. We now need to understand when $g(r)$ is close to 0, since the corresponding terms will provide the main contribution to our sum. So for any $d$, some slices are "picked out" to have expected degree $d$, whereas others are not. This depends on the value of $g(r)$. From here on, it only requires (many) tedious calculations to get the final result.

What are the different possible values of $g(r)$? We remind the reader that $g(r) = r\ln\tau - \ln(d/\lambda)$. Observe that $r_d = \lfloor \ln(d/\lambda)/\ln\tau \rfloor$ minimizes $|g(r)|$ subject to $g(r) \le 0$, and $r_d + 1$ (which is the corresponding ceiling) minimizes $|g(r)|$ subject to $g(r) > 0$. For convenience, denote $r_d$ by $r_f$ (for floor) and $r_d + 1$ by $r_c$ (for ceiling).

Consider some $r$ such that $|g(r)| < 1$. It is either of the form $r = r_c + s$ or $r = r_f - s$, for integer $s \ge 0$. We will sum up all the terms corresponding to each set separately. For convenience, denote the set of values of $s$ such that $|g(r_c + s)| < 1$ by $S_1$, and define $S_2$ analogously with respect to $r_f - s$. This allows us to split the main sum into two parts, which we deal with separately.
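Case 1 (the sum over $S_1$):

$$\sum_{s \in S_1} \exp\left[d\left(1 + g(r_c + s) - e^{g(r_c + s)}\right)\right] \binom{\ell}{\ell/2 + r_c + s} = (1 \pm o(1)) \exp\left(-\frac{d\,g(r_c)^2}{2}\right) \binom{\ell}{\ell/2 + r_c} + (1 \pm o(1)) \sum_{s \in S_1,\, s \ne 0} \exp\left(-\frac{d\,g(r_c + s)^2}{2}\right) \binom{\ell}{\ell/2 + r_c + s}.$$

We substitute $g(r_c + s) = g(r_c) + s\ln\tau$ into the second part, and show that we can bound this whole summation as an error term. Note that both $s$ and $\ln\tau$ are positive by construction.

$$\sum_{s \in S_1,\, s \ne 0} \exp\left(-\frac{d\,g(r_c + s)^2}{2}\right) \binom{\ell}{\ell/2 + r_c + s} \le \sum_{s \in S_1,\, s \ne 0} \exp\left[-\frac{d\left(g(r_c)^2 + s^2(\ln\tau)^2\right)}{2}\right] \binom{\ell}{\ell/2 + r_c + s}$$

$$\le \exp\left(-\frac{d\,g(r_c)^2}{2}\right) \sum_{s > 0} \exp\left(-\frac{d s^2 (\ln\tau)^2}{2}\right) \binom{\ell}{\ell/2 + r_c + s} = o\left( \exp\left(-\frac{d\,g(r_c)^2}{2}\right) \binom{\ell}{\ell/2 + r_c} \right).$$

For the last step, observe that $\binom{\ell}{\ell/2 + r_c + s} \le \ell^s \binom{\ell}{\ell/2 + r_c}$. Since $d \ge \ell$, the exponential decay of $\exp(-\Theta(ds^2))$ completely kills this summation.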

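Case 2 (the sum over $S_2$): Now, we apply an identical argument for $r = r_f - s$. We have $g(r) = g(r_f) - s\ln\tau$. Applying the same calculations as above,

$$\sum_{s \in S_2} \exp\left[d\left(1 + g(r) - e^{g(r)}\right)\right] \binom{\ell}{\ell/2 + r_f - s} = (1 \pm o(1)) \exp\left(-\frac{d\,g(r_f)^2}{2}\right) \binom{\ell}{\ell/2 + r_f}.$$

Adding the bounds from both the cases, we conclude

$$\sum_{r \notin R} \exp\left[d\left(1 + g(r) - e^{g(r)}\right)\right] \binom{\ell}{\ell/2 + r} = (1 \pm o(1)) \left[ \exp\left(-\frac{d\,g(r_f)^2}{2}\right) \binom{\ell}{\ell/2 + r_f} + \exp\left(-\frac{d\,g(r_c)^2}{2}\right) \binom{\ell}{\ell/2 + r_c} \right]. \qquad (2)$$

We showed earlier that $r_f = r_d$ and $r_c = r_d + 1$. We remind the reader that $\theta_d = \ln(d/\lambda)/\ln\tau$, $r_d = \lfloor\theta_d\rfloor$, and $\delta_d = \theta_d - r_d$. Hence $g(r_f) = g(\theta_d) - \delta_d\ln\tau = -\delta_d\ln\tau$. Since $r_c = r_f + 1$, $g(r_c) = \ln\tau + g(r_f) = (1 - \delta_d)\ln\tau$. We substitute in (2) to complete the proof. □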

4. ENHANCING SKG WITH NOISE: NSKG

Let us now focus on a noisy version of SKG that removes the fluctuations in the degree distribution. We will refer to our proposed noisy SKG model as NSKG. The idea is quite simple. For each level $i \le \ell$, define a new matrix $T_i$ in such a way that the expectation of $T_i$ is just $T$. At level $i$ in the edge insertion, we use the matrix $T_i$ to choose the appropriate quadrant.

Here is a formal description. For convenience, we will assume that $T$ is symmetric. It is fairly easy to generalize to general $T$. Let $b$ be our noise parameter such that $b \le \min((t_1 + t_4)/2, t_2)$. For level $i$, choose $\mu_i$ to be a uniform random number in the range $[-b, +b]$. Set $T_i$ to be

$$T_i = \begin{bmatrix} t_1 - \dfrac{2\mu_i t_1}{t_1 + t_4} & t_2 + \mu_i \\ t_3 + \mu_i & t_4 - \dfrac{2\mu_i t_4}{t_1 + t_4} \end{bmatrix}.$$

Note that $T_i$ is symmetric, its entries sum to 1, and all entries are positive. This is by no means the only model of noise, but it is certainly convenient for analysis. Each level involves only one random number $\mu_i$, which changes all the entries of $T$ in a linear fashion. Hence, we only need $\ell$ random numbers in total. For convenience, we list out the noise parameters of NSKG in Table IV.

Table IV: Parameters for NSKG

- $b$ = noise parameter $\le \min((t_1 + t_4)/2, t_2)$
- $\mu_i$ = noise at level $i = 1, \ldots, \ell$
- $T_i$ = noisy generating matrix at level $i = 1, \ldots, \ell$ (entries as displayed above)
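
The construction above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names `noisy_matrices` and `sample_edge`, the quadrant-to-bit convention, and the use of Python's `random` module are our assumptions, and the matrix in the usage example is the commonly quoted Graph500 initiator (0.57, 0.19, 0.19, 0.05).

```python
import random

def noisy_matrices(T, b, levels, rng=None):
    """Return `levels` noisy 2x2 matrices T_i for a symmetric T = [[t1, t2], [t3, t4]]."""
    rng = rng or random.Random()
    (t1, t2), (t3, t4) = T
    if abs(t2 - t3) > 1e-12:
        raise ValueError("this sketch assumes a symmetric T (t2 == t3)")
    if b > min((t1 + t4) / 2, t2):
        raise ValueError("noise parameter b exceeds min((t1 + t4)/2, t2)")
    mats = []
    for _ in range(levels):
        mu = rng.uniform(-b, b)          # one random number per level
        shift = 2 * mu / (t1 + t4)       # diagonal entries absorb -2*mu in total
        mats.append([[t1 - shift * t1, t2 + mu],
                     [t3 + mu, t4 - shift * t4]])
    return mats

def sample_edge(mats, rng):
    """Insert one edge: choose a quadrant at every level using that level's T_i."""
    row = col = 0
    for (a, bq), (c, _dq) in mats:
        u = rng.random()
        if u < a:
            i, j = 0, 0
        elif u < a + bq:
            i, j = 0, 1
        elif u < a + bq + c:
            i, j = 1, 0
        else:
            i, j = 1, 1
        row, col = 2 * row + i, 2 * col + j
    return row, col

rng = random.Random(0)
mats = noisy_matrices([[0.57, 0.19], [0.19, 0.05]], b=0.1, levels=26, rng=rng)
edges = [sample_edge(mats, rng) for _ in range(5)]
```

Note that the list `mats` is drawn once and then reused for every call to `sample_edge`; only $\ell$ random numbers are spent on noise, matching the description above.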

In Figures 1, 4a, and 4b, we show the effects of noise. Observe how even a noise parameter 
as small as 0.05 (which is extremely small compared to the matrix values) significantly 
reduces the magnitude of oscillations. A noise of 0.1 almost removes the oscillations. (Even 
this noise is very small, since the standard deviation of this noise parameter is at most 0.06.) 
Our proposed method of adding noise dampens the undesirable exponential tail behavior 
of SKG, leading to a monotonic degree distribution. 




Fig. 4: The figures show the degree distribution of standard SKG and NSKG as the averages of 25 instances, for (a) CAHepPh and (b) WEBNotreDame (horizontal axis: out-degree). Notice how effectively a noise of 0.1 straightens the degree distribution.

4.1. Why does noise help?

Before we state our formal theorem, let us set some asymptotic notation that will allow for a more readable theorem. We will use the $O(\cdot)$ notation to suppress constant factors, where (for notational convenience) these constants may depend on the constants in the matrix $T$. As before, $o(1)$ is a quantity that goes to zero as $\ell$ grows.

Our formal theorem says that when the noise is "large enough," we can show that the degree distribution has at least a lognormal tail on average. This is a significant change from SKG, where many degrees are below an exponential tail.

Theorem 4.1. Let the noise $b$ be set to $c/\sqrt{\ell}$ for positive $c$, such that $c/\sqrt{\ell} \le \min((t_1 + t_4)/2, t_2)$. Then the expected degree distribution for NSKG is bounded below by a lognormal. Formally, when $\Gamma_d \le \ell/2$ and $d \le \sqrt{n}$,

$$E[X_d] \ge \frac{\nu(c)}{d} \binom{\ell}{\ell/2 + \Gamma_d}.$$

Here $\nu(c)$ is some positive function of $c$. (This is independent of $\ell$, so for constant $c$, $\nu(c)$ is a positive constant.)

This bound tells us that as $\ell$ increases, we need less noise to get a lognormal tail. From a Graph500 perspective, if we determine (through experimentation) that for some small $\ell$ a certain amount of noise suffices, the same amount of noise is certainly enough for larger $\ell$.

We now provide a verbal description of the main ideas. Let us assume that $\lambda = 1$ and $\tau = e$, as before. We focus our attention on a vertex $v$ of slice $r$, and wish to compute the probability that it has degree $d$. Note the two sources of randomness: one coming from the choice of the noisy SKG matrices, and the second from the actual graph generation. We associate a bias parameter $\rho_v$ with every vertex $v$. This can be thought of as some measure of how far the degree behavior of $v$ deviates from its noiseless version. Actually, it is the random variable $\ln\rho_v$ that we are interested in. Intuitively, this can just be thought of as a Gaussian random variable with mean zero. The distribution of $\rho_v$ is identical for all vertices in slice $r$. (Though it does not matter for our purposes, for a given instantiation of the noisy SKG matrices, vertices in the same slice can have different biases.)

We approximate the probability that $v$ has degree $d$ by (refer to Claim 4.11)

$$\Pr[\deg(v) = d] = \exp(dr + d\ln\rho_v - \rho_v e^r)/d!.$$

After some simplifications, this is roughly equal to $\exp(-d(r - \ln d - \ln\rho_v)^2)$. The additional $\ln\rho_v$ will act as a smoothing term. Observe that even if $\ln d$ has a large fractional part, we could still get vertices of degree $d$. Suppose $\ln d = 10.5$, but $\ln\rho_v$ happened to be close to 0.5. Then vertices in slice $\lfloor\ln d\rceil$ would have degree $d$ with some nontrivial probability. Contrast this with regular SKG, where there is almost no chance that degree $d$ vertices exist.

Think of the probability as $\exp(-d(r - \ln d - X)^2)$, where $X$ is a random variable. The expected probability will be an average over the distribution of $X$. Intuitively, instead of the probability just being $\exp(-d(r - \ln d)^2)$ (in the case of SKG), it is now the average value over some interval. If the standard deviation of $X$ is sufficiently large, even though $\exp(-d(r - \ln d)^2)$ is small, the average of $\exp(-d(r - \ln d - X)^2)$ can be large. Refer to Figure 5.

We know that $X$ is a Gaussian random variable (with some standard deviation $\sigma$). So we can formally express the (expected) probability that $v$ has degree $d$ as an integral,

$$\int_{-\infty}^{+\infty} \exp\left(-d(r - \ln d - X)^2\right) \frac{1}{\sigma\sqrt{2\pi}}\, e^{-X^2/(2\sigma^2)}\, dX.$$

This definite integral can be evaluated exactly (since it is just a Gaussian). Intuitively, this is roughly the average value of $\exp(-d(r - \ln d - X)^2)$, where $X$ ranges from $-\sigma$ to $+\sigma$. Suppose $\sigma > 1$. Since $r$ ranges over the integers, there is always some $r$ such that $|r - \ln d| < 1$. For this value of $r$, the average of $\exp(-d(r - \ln d - X)^2)$ over the range $X \in [-1, +1]$ will have a reasonably large value. This ensures that (in expectation) many vertices in this slice $r$ have degree $d$. This can be shown for all degrees $d$, and we can prove that the degree distribution is at least lognormal.

Fig. 5: The effect of noise. The underlying Gaussian curve is the same as the one in Figure 3. Adding noise can be thought of as an average over the Gaussian. So the probability that a vertex from slice $r$ has degree $d$ is the area of the shaded region.

This is an intuitive sketch of the proof. The random variable $\ln\rho_v$ is not exactly Gaussian, and hence we have to account for errors in such an approximation. We do not finally get a definite integral that can be evaluated exactly, but we can give good bounds for its value.
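
To get a feel for the smoothing effect in the idealized picture (where the perturbation $X$ is exactly Gaussian with mean zero and standard deviation $s$), the average of $\exp(-d(a - X)^2)$ with $a = r - \ln d$ has the standard closed form $\exp(-d a^2/(1 + 2 d s^2))/\sqrt{1 + 2 d s^2}$. The sketch below evaluates it; the name `slice_weight` and the symbol $s$ are ours, and the Gaussian assumption is exactly the simplification that the formal proof has to work around.

```python
import math

def slice_weight(d: float, r: int, s: float = 0.0) -> float:
    """Average of exp(-d * (r - ln d - X)^2) for X ~ N(0, s^2).

    With s = 0 this is the noiseless SKG weight exp(-d * (r - ln d)^2).
    """
    a = r - math.log(d)
    k = 1.0 + 2.0 * d * s * s          # variance inflation from averaging over X
    return math.exp(-d * a * a / k) / math.sqrt(k)

d, r = 33, 3                            # ln 33 is about 3.5, far from integral
print(slice_weight(d, r))               # noiseless: exponentially small
print(slice_weight(d, r, s=1.0))        # with noise: a polynomially small factor
```

The design point is visible in the formula: noise trades the exponential penalty $\exp(-d a^2)$ for a polynomial factor of order $1/\sqrt{d}$, which is the same $1/\sqrt{d}$ loss that appears in the formal lower bound later in this section.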

4.2. Preliminaries for analysis 

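There are many new parameters we need to introduce for our NSKG analysis. Each of these quantities is a random variable that depends on the choice of the matrices $T_i$. We list them below.

- $\sigma_i$: We set $\sigma_i = t_1 - \dfrac{2\mu_i t_1}{t_1 + t_4} + t_2 + \mu_i - \dfrac{1}{2}$, so that $1/2 + \sigma_i$ is the sum of the top row of $T_i$.

- $\alpha_i = (1/2 + \sigma_i)/(1/2 + \sigma)$. It will be convenient to express this in terms of $\mu_i$, replacing the dependence on $\sigma_i$:

$$\alpha_i = \frac{1/2 + \sigma_i}{t_1 + t_2} = 1 - \frac{\mu_i (t_1 - t_4)}{(t_1 + t_2)(t_1 + t_4)}.$$

- $\beta_i = (1/2 - \sigma_i)/(1/2 - \sigma)$. Performing a calculation similar to the one above,

$$\beta_i = \frac{1/2 - \sigma_i}{t_3 + t_4} = 1 + \frac{\mu_i (t_1 - t_4)}{(t_3 + t_4)(t_1 + t_4)}.$$

- $b_\alpha, b_\beta$: We set

$$b_\alpha = \frac{b(t_1 - t_4)}{(t_1 + t_2)(t_1 + t_4)} = \frac{4b\sigma}{(1 + 2\sigma)(t_1 + t_4)}.$$

Similarly,

$$b_\beta = \frac{b(t_1 - t_4)}{(t_3 + t_4)(t_1 + t_4)} = \frac{4b\sigma}{(1 - 2\sigma)(t_1 + t_4)}.$$

Hence, $\alpha_i$ is distributed uniformly at random in $[1 - b_\alpha, 1 + b_\alpha]$, and $\beta_i$ is uniformly random in $[1 - b_\beta, 1 + b_\beta]$. Note that $b_\alpha, b_\beta = \Theta(c/\sqrt{\ell})$.

- $\rho_v$: Let $v$ be represented as a bit vector $(z_1, \ldots, z_\ell)$. The bias for $v$ is $\rho_v = \prod_{i: z_i = 0} \alpha_i \prod_{i: z_i = 1} \beta_i$. We set $\lambda_v = \lambda\rho_v$.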

4.3. The behavior of $\ln\rho_v$

We need to bound the behavior of $\ln\rho_v$, which is $\sum_{i: z_i = 0} \ln\alpha_i + \sum_{i: z_i = 1} \ln\beta_i$. Observe that this is a sum of independent random variables. By the Central Limit Theorem, we expect $\ln\rho_v$ to be distributed as a Gaussian, but we still need to investigate the variance of this distribution. Approximately (since $b_\alpha$ and $b_\beta$ are small), $\ln\alpha_i$ is uniformly random in $[-b_\alpha, b_\alpha]$, so the variance of $\ln\alpha_i$ is $\Theta(b_\alpha^2) = \Theta(1/\ell)$. A similar statement holds for $\ln\beta_i$, and we bound the variance of $\ln\rho_v$ by $\Theta(1)$. So the probability density function (pdf) of $\ln\rho_v$ is roughly concentrated in a constant-sized interval (around 0). This is what we will formally show in this section. We will need a pointwise convergence guarantee for the pdf of $\ln\rho_v$. Throughout this section, we will use various functions of the form $\nu_1(c), \nu_2(c), \ldots$. These are strictly positive functions of $c$ (for $c > 0$), and are a convenient way of tracking dependences on $c$. The reader should interpret $\nu_i(c)$ to be some constant that depends on $c$ (and $T$ and $\Delta$, which are fixed), but is independent of $\ell$. The main lemma of this section is the following.

Lemma 4.2. Set $\hat{\tau} = \max(\ln\tau, 2)$. Let $f_v(x)$ be the pdf of $\ln\rho_v$. For $|x| \le \hat{\tau}$, $f_v(x) \ge \nu_1(c)$.

We will first prove Lemma 4.2 as a direct result of two claims stated below. Then we will prove these claims in the subsequent subsections. The first claim, the more technical of the two, shows that $\ln\rho_v$ has a sufficiently large probability of attaining a constant value.

Claim 4.3. There exists a constant $C > \hat{\tau}$, such that the probability that $\ln\rho_v$ lies in $[\hat{\tau}, C]$ is at least $\nu_2(c)$ and that of lying in $[-C, -\hat{\tau}]$ is also at least $\nu_2(c)$.

The next claim will be a consequence of the unimodality of $f_v(x)$.

Claim 4.4. For any $x \in [x_1, x_2]$, $f_v(x) \ge \min(f_v(x_1), f_v(x_2))$.

Now for the proof of Lemma 4.2.

Proof (of Lemma 4.2). By Claim 4.3, the probability that $\ln\rho_v$ lies in $I := [-C, -\hat{\tau}]$ is at least $\nu_2(c)$. Therefore, $(C - \hat{\tau}) \max_{x \in I} f_v(x) \ge \nu_2(c)$. Suppose the maximum is achieved at $x_1$. This means that there exists $x_1 \in [-C, -\hat{\tau}]$ with $f_v(x_1) = \Omega(\nu_2(c))$. Similarly, there exists some $x_2 \in [\hat{\tau}, C]$ such that $f_v(x_2) = \Omega(\nu_2(c))$. Observe that for any $x$ such that $|x| \le \hat{\tau}$, $x \in [x_1, x_2]$. By Claim 4.4, for any such $x$, $f_v(x) = \Omega(\nu_2(c))$. Therefore, we can bound $f_v(x) \ge \nu_1(c)$, for some positive function $\nu_1$. □

4.3.1. Proving Claim 4.3. We begin with notational setup. We fix some vertex $v$. For convenience, define the variables $\tilde{\alpha}_i$ (for all $i \le \ell$). If $z_i = 0$, set $\tilde{\alpha}_i = \alpha_i$, and $\tilde{\alpha}_i = \beta_i$ otherwise. We can write $\ln\rho_v = \sum_i \ln\tilde{\alpha}_i$. The random variable $\tilde{\alpha}_i$ is uniform in $[1 - b_i, 1 + b_i]$, where $b_i$ is either $b_\alpha$ or $b_\beta$ appropriately. Set the zero mean random variable $X_i = \ln\tilde{\alpha}_i - E[\ln\tilde{\alpha}_i]$. We have the following series of facts.

Claim 4.5.

- The pdf of $\ln\tilde{\alpha}_i$, denoted by $h_i(x)$, is given as follows. For $x \in [\ln(1 - b_i), \ln(1 + b_i)]$, $h_i(x) = e^x/(2b_i)$, and zero otherwise.

- $|E[\ln\tilde{\alpha}_i]| = O(c^2/\ell)$, $E[X_i^2] = \Theta(c^2/\ell)$, and $E[|X_i|^3] = O(c\,E[X_i^2]/\sqrt{\ell})$.

Proof. The pdf of $\tilde{\alpha}_i$ is $h(x) = 1/(2b_i)$ for $x \in [1 - b_i, 1 + b_i]$ and zero otherwise. For any monotone function $F(x)$, the pdf of $F(\tilde{\alpha}_i)$ is given by $|dF^{-1}(x)/dx| \cdot h(F^{-1}(x))$. Setting $F$ as the function $\ln$, the pdf of $\ln\tilde{\alpha}_i$, $h_i(x)$, is given by $e^x/(2b_i)$ for $x \in [\ln(1 - b_i), \ln(1 + b_i)]$ and zero otherwise.

$$E[\ln\tilde{\alpha}_i] = \int_{\ln(1-b_i)}^{\ln(1+b_i)} x\,h_i(x)\,dx = (2b_i)^{-1} \int_{\ln(1-b_i)}^{\ln(1+b_i)} x e^x\,dx.$$

Using integration by parts,

$$\int_{\ln(1-b_i)}^{\ln(1+b_i)} x e^x\,dx = \Big[x e^x\Big]_{\ln(1-b_i)}^{\ln(1+b_i)} - \int_{\ln(1-b_i)}^{\ln(1+b_i)} e^x\,dx$$
$$= [(1 + b_i)\ln(1 + b_i) - (1 - b_i)\ln(1 - b_i)] - [(1 + b_i) - (1 - b_i)]$$
$$= b_i \ln(1 - b_i^2) + \ln(1 + b_i) - \ln(1 - b_i) - 2b_i.$$

Taking absolute values,

$$\left| \int_{\ln(1-b_i)}^{\ln(1+b_i)} x e^x\,dx \right| \le |b_i \ln(1 - b_i^2)| + |\ln(1 + b_i) - \ln(1 - b_i) - 2b_i|.$$

The first term is at most $2b_i^3$. For the second term, we need a finer Taylor approximation.

$$\ln(1 + b_i) - \ln(1 - b_i) - 2b_i \le (b_i - b_i^2/2 + b_i^3) - (-b_i - b_i^2/2) - 2b_i \le b_i^3$$
$$\ln(1 + b_i) - \ln(1 - b_i) - 2b_i \ge (b_i - b_i^2/2) - (-b_i - b_i^2/2 - b_i^3) - 2b_i \ge -b_i^3$$

All in all, $|E[\ln\tilde{\alpha}_i]| \le O(b_i^2) = O(c^2/\ell)$.

$$E[X_i^2] = E[(\ln\tilde{\alpha}_i)^2] - (E[\ln\tilde{\alpha}_i])^2$$

$$E[(\ln\tilde{\alpha}_i)^2] = (2b_i)^{-1} \int_{\ln(1-b_i)}^{\ln(1+b_i)} x^2 e^x\,dx$$

To get an upper bound for this term, we use the following inequalities: $\ln(1 + b_i) \le 2b_i$, $\ln(1 - b_i) \ge -2b_i$, $e^x \le e$. That gives $E[(\ln\tilde{\alpha}_i)^2] \le e(2b_i)^{-1} \int_{-2b_i}^{2b_i} x^2\,dx = O(b_i^2)$. For a lower bound, we use: $\ln(1 + b_i) \ge b_i/2$, $\ln(1 - b_i) \le -b_i/2$, $e^x \ge 1/e$. Hence, $E[(\ln\tilde{\alpha}_i)^2] \ge (2eb_i)^{-1} \int_{-b_i/2}^{b_i/2} x^2\,dx = \Omega(b_i^2)$. Note that $(E[\ln\tilde{\alpha}_i])^2 \le O(b_i^4)$, which is much smaller than $b_i^2$ for sufficiently small $b_i$. We conclude that $E[X_i^2] = \Theta(b_i^2) = \Theta(c^2/\ell)$.

For the final bound, we use a trivial estimate. We have $E[|X_i|^3] \le \max(|X_i|)\,E[X_i^2] \le 2b_i E[X_i^2]$. □
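
The closed form obtained by integration by parts can be cross-checked numerically. The sketch below is an illustration under our own naming (`mean_log_uniform`, `mean_log_uniform_numeric` are hypothetical helpers): it compares the closed form of $E[\ln a]$ for $a$ uniform on $[1 - b, 1 + b]$ against a midpoint-rule estimate, and prints the ratio to $b^2$ to make the $O(b^2)$ scaling visible.

```python
import math

def mean_log_uniform(b: float) -> float:
    """Closed form of E[ln a] for a uniform on [1 - b, 1 + b], with 0 < b < 1."""
    integral = (b * math.log(1 - b * b)
                + math.log(1 + b) - math.log(1 - b) - 2 * b)
    return integral / (2 * b)

def mean_log_uniform_numeric(b: float, steps: int = 100_000) -> float:
    """Midpoint-rule estimate of the same expectation, for comparison."""
    h = 2 * b / steps
    total = sum(math.log(1 - b + (k + 0.5) * h) for k in range(steps))
    return total * h / (2 * b)

for b in (0.2, 0.1, 0.05):
    exact = mean_log_uniform(b)
    print(b, exact, mean_log_uniform_numeric(b), exact / (b * b))
```

For small $b$ the printed ratio settles near a constant, consistent with $|E[\ln\tilde{\alpha}_i]| = O(b_i^2)$.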

We now state the Berry-Esseen Theorem [Berry 1941; Esseen 1942], a crucial ingredient of our proof. This theorem bounds the convergence rate of a sum of independent random variables to a Gaussian.

Theorem 4.6. [Berry-Esseen] Let $X_1, X_2, \ldots, X_\ell$ be independent random variables with $E[X_i] = 0$, $E[X_i^2] = \xi_i^2$, and $E[|X_i|^3] = \zeta_i < \infty$. Let $S$ be the sum $\sum_i X_i / \sqrt{\sum_i \xi_i^2}$. Let $F(x)$ denote the cumulative distribution function (cdf) of $S$ and $\Phi(x)$ be the cdf of the standard normal (the pdf is $(2\pi)^{-1/2} e^{-x^2/2}$). Then, for an absolute constant $C_1 > 0$,

$$\sup_x |F(x) - \Phi(x)| \le C_1 \Big(\sum_i \xi_i^2\Big)^{-3/2} \sum_i \zeta_i.$$

Proof (of Claim 4.3). We set $X = \sum_i X_i / \sqrt{\sum_i E[X_i^2]} = (\ln\rho_v - E[\ln\rho_v]) / \sqrt{\sum_i E[X_i^2]}$. By Claim 4.5, $|E[\ln\rho_v]| = |\sum_i E[\ln\tilde{\alpha}_i]| \le \sum_i |E[\ln\tilde{\alpha}_i]| = O(c^2)$ and $\sum_i E[X_i^2] = \Theta(c^2)$. Note that $X$ is just an increasing linear function of $\ln\rho_v$. Set the function $r(x) = (x - E[\ln\rho_v]) / \sqrt{\sum_i E[X_i^2]}$, so $X = r(\ln\rho_v)$. For any interval $I = [x_1, x_2]$, $\Pr[\ln\rho_v \in I] = \Pr[X \in [r(x_1), r(x_2)]]$. Since $|r(\hat{\tau})|$ is some constant function of $c$, we can find a constant $C$ such that $r(C)$ is strictly larger than $|r(\hat{\tau})|$. Setting $y_1 = r(\hat{\tau})$, $y_2 = r(C)$ and using the notation from Theorem 4.6,

$$\Pr[X \in [y_1, y_2]] = F(y_2) - F(y_1) = \Phi(y_2) - \Phi(y_1) + (F(y_2) - \Phi(y_2)) + (\Phi(y_1) - F(y_1))$$
$$\ge \Phi(y_2) - \Phi(y_1) - |F(y_2) - \Phi(y_2)| - |F(y_1) - \Phi(y_1)|.$$

Since $y_1 < y_2$ and both are constant functions of $c$, $\Phi(y_2) - \Phi(y_1) \ge \nu_3(c)$. By the Berry-Esseen theorem (Theorem 4.6), $|F(y_2) - \Phi(y_2)| + |F(y_1) - \Phi(y_1)| \le 2C_1 (\sum_i \xi_i^2)^{-3/2} \sum_i \zeta_i$. By Claim 4.5, $\zeta_i = O(c\,\xi_i^2/\sqrt{\ell})$ and $\sum_i \xi_i^2 = \Theta(c^2)$. So the Berry-Esseen bound is at most $O(c\,(\sum_i \xi_i^2)^{-1/2}/\sqrt{\ell}) = O(1/\sqrt{\ell})$. By setting $C$ to be a large enough constant, we can ensure that $\Phi(y_2) - \Phi(y_1)$ exceeds this bound by a constant margin.

We deduce that $\Pr[X \in [y_1, y_2]] \ge \nu_2(c)$, for some positive function $\nu_2$. A similar proof holds for $[-C, -\hat{\tau}]$. □

4.3.2. Proving Claim 4.4. We state some technical definitions and results about convolutions 
of unimodal functions. 

Definition 4.7. A pdf $f(x)$ is unimodal if there exists an $a \in \mathbb{R}$ such that $f$ is non-decreasing on $(-\infty, a)$ and non-increasing on $(a, \infty)$.

A pdf $f(x)$ is log-concave if $\Omega := \{x : f(x) > 0\}$ is an interval and $\ln f(x)$ is a concave function (on the interval $\Omega$).

A theorem of Ibragimov [Ibragimov 1956] gives some convolution properties of unimodal log-concave functions.

Theorem 4.8. [Ibragimov] Let $f(x)$ be a unimodal log-concave pdf and $g(x)$ be a unimodal pdf. The convolution $f * g$ is also unimodal.

Claim 4.9. The pdf $f_v(x)$ is unimodal.

Proof. We have $\ln\rho_v = \sum_i \ln\tilde{\alpha}_i$. By Claim 4.5, the pdf of $\ln\tilde{\alpha}_i$ is $h_i(x) = e^x/(2b_i)$ on its support. Note that $h_i(x)$ is unimodal. Furthermore, $\ln h_i(x) = x - \ln 2b_i$, which is concave. Since $\ln\rho_v$ is the sum of independent random variables, the pdf $f_v(x)$ is the convolution of the individual pdfs. Repeated applications of Ibragimov's theorem (Theorem 4.8) tell us that $f_v(x)$ is unimodal. □

Proof (of Claim 4.4). By the unimodality of $f_v$, $f_v$ is either non-decreasing, non-increasing, or non-decreasing and then non-increasing in the interval $[x_1, x_2]$. Regardless of which case holds, for any $y \in [x_1, x_2]$, $f_v(y) \ge \min(f_v(x_1), f_v(x_2))$. □

4.4. Basic claims for NSKG 

We now reprove some of the basic claims for NSKG. Note that when we look at $E[X_d]$, the expectation is over both the randomness in $T$ and the edge insertions. We use $\mathcal{T}$ to denote the set of matrices $T_1, T_2, \ldots, T_\ell$. Conditioning on $\mathcal{T}$ simply means conditioning on a fixed choice of the noise.

Claim 4.10. Let vertex $v \in S_r$. Choose the noise for NSKG at random, and let $p_v$ be the probability (conditioned on $\mathcal{T}$) that a single edge insertion produces an out-edge at $v$. (Note that $p_v$ is itself a random variable, where the dependence on $\mathcal{T}$ is given by $\rho_v$.)

$$p_v = \frac{(1 - 4\sigma^2)^{\ell/2}\tau^r\rho_v}{n}.$$

Proof. This is identical to the proof of Claim 3.3. Consider a single edge insertion. For an edge insertion to be incident to $v$, the edge must go into the half corresponding to the binary representation of $v$. If the $i$th bit of $v$ is 0, then the edge should drop in the top half at this level, and this happens with probability $(1/2 + \sigma_i)$. On the other hand, if this bit is 1, then the edge should drop in the bottom half, which happens with probability $(1/2 - \sigma_i)$. Let the bit representation of $v$ be $(z_1, z_2, \ldots, z_\ell)$. Then,

$$p_v = \prod_{i: z_i = 0} \Big(\frac{1}{2} + \sigma_i\Big) \prod_{i: z_i = 1} \Big(\frac{1}{2} - \sigma_i\Big) = \Big(\frac{1}{2} + \sigma\Big)^{\ell/2 + r} \Big(\frac{1}{2} - \sigma\Big)^{\ell/2 - r} \prod_{i: z_i = 0} \alpha_i \prod_{i: z_i = 1} \beta_i$$
$$= \frac{1}{2^\ell} (1 - 4\sigma^2)^{\ell/2}\tau^r\rho_v = \frac{(1 - 4\sigma^2)^{\ell/2}\tau^r\rho_v}{n}.$$

□

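As before, we will assume that $p_v = o(1/\sqrt{m})$ and $d = o(\sqrt{n})$. Even though $p_v$ is a random variable, the probability that it is larger than $1/\sqrt{m}$ can be neglected. (This was discussed in more detail before Lemma 3.5.) We stress that in the following, the probability that $v$ has outdegree $d$ is itself a random variable.

Claim 4.11. Let $v$ be a vertex in slice $r$, $d = o(\sqrt{n})$, and $p_v = o(1/\sqrt{m})$. Then for NSKG, we have

$$\Pr[\deg(v) = d \mid \mathcal{T}] = (1 \pm o(1)) \frac{\lambda_v^d}{d!} \cdot \frac{(\tau^r)^d}{\exp(\lambda_v\tau^r)}.$$

Proof. We follow the proof of Lemma 3.5. We approximate $\binom{m}{d}$ by $m^d/d!$ and $(1 - x)^{m-d}$ by $e^{-xm}$, for $x = o(1/\sqrt{m})$ and $d = o(\sqrt{n})$. This approximation is performed in the first step below. We remind the reader that $\lambda_v = \lambda\rho_v$. By Claim 4.10 and the above approximations,

$$\binom{m}{d} \left(\frac{(1 - 4\sigma^2)^{\ell/2}\tau^r\rho_v}{n}\right)^d \left(1 - \frac{(1 - 4\sigma^2)^{\ell/2}\tau^r\rho_v}{n}\right)^{m-d} = (1 \pm o(1)) \frac{m^d}{d!} \left(\frac{(1 - 4\sigma^2)^{\ell/2}\tau^r\rho_v}{n}\right)^d \exp\left(-\frac{(1 - 4\sigma^2)^{\ell/2}\tau^r\rho_v\, m}{n}\right)$$
$$= (1 \pm o(1)) \frac{\lambda_v^d}{d!} \cdot \frac{(\tau^r)^d}{\exp(\lambda_v\tau^r)}.$$

□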



4.5. Bounds for degree distribution 

We complete the proof of Theorem 4.1. We break it down into some smaller claims. By and 
large, the flow of the proof is similar to that for the standard SKG. The main difference 
comes because the probabilities discussed in Claim 4.11 are random variables depending 
on the noise. The following claim is fairly straightforward, given the previous analysis of 
standard SKG. This is where we apply the Taylor approximations to show the Gaussian 
behavior depicted in Figure 3. 
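Claim 4.12. Consider some setting of the NSKG noise. Define $g_v(r) = r\ln\tau - \ln(d/\lambda_v)$. The expected number of vertices of degree $d$ conditioned on $\mathcal{T}$ is

$$E[X_d \mid \mathcal{T}] = \frac{1 \pm o(1)}{\sqrt{2\pi d}} \sum_{r=-\ell/2}^{\ell/2} \sum_{v \in S_r} \exp[-d\,g_v(r)^2/2].$$

Proof. By fixing some $\mathcal{T}$, the $\lambda_v$'s are fixed. We use Claim 4.11, linearity of expectation, and Stirling's approximation in the following.

$$E[X_d \mid \mathcal{T}] = (1 \pm o(1)) \sum_{r=-\ell/2}^{\ell/2} \sum_{v \in S_r} \frac{\lambda_v^d (\tau^r)^d}{d!\,\exp(\lambda_v\tau^r)} = \frac{1 \pm o(1)}{\sqrt{2\pi d}} \sum_{r=-\ell/2}^{\ell/2} \sum_{v \in S_r} \left(\frac{e\lambda_v}{d}\right)^d \frac{(\tau^r)^d}{\exp(\lambda_v\tau^r)}.$$

Choose $v \in S_r$. Then

$$\left(\frac{e\lambda_v}{d}\right)^d \frac{(\tau^r)^d}{\exp(\lambda_v\tau^r)} = \exp(d + d\ln\lambda_v + rd\ln\tau - d\ln d - \lambda_v\tau^r).$$

Define $f_v(r) = rd\ln\tau - \lambda_v\tau^r - d\ln d + d\ln\lambda_v + d$, where $r$ is an integer. We have $r = (\ln d - \ln\lambda_v + g_v(r))/\ln\tau$.

$$f_v(r) = d\ln d - d\ln\lambda_v + d\,g_v(r) - e^{g_v(r)} d - d\ln d + d\ln\lambda_v + d = d\left(1 + g_v(r) - e^{g_v(r)}\right).$$

If $|g_v(r)| < 1$, then we can approximate $f_v(r) = -d[g_v(r)^2/2 + \Theta(g_v(r)^3)]$, and get $\exp(f_v(r)) = (1 \pm o(1)) \exp(-d\,g_v(r)^2/2)$. This is analogous to the beginning of the proof of Claim 3.8. Suppose $|g_v(r)| \ge 1$. Then, arguing as in the proof of Claim 3.7, we deduce that $\exp(f_v(r)) \le 2^{-\ell}$. The sum of all these terms over $v$ is just a lower order term. So, we can substitute this by $\exp(-d\,g_v(r)^2/2)$. Hence, we can bound

$$E[X_d \mid \mathcal{T}] = \frac{1 \pm o(1)}{\sqrt{2\pi d}} \sum_{r=-\ell/2}^{\ell/2} \sum_{v \in S_r} \exp[-d\,g_v(r)^2/2]. \qquad □$$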



We now reach the main challenge of this proof. The quantity $E[\exp(-d\,g_v(r)^2/2)]$ is evaluated by averaging over all noise. Note that the actual graph has no effect on this quantity.

Lemma 4.13. Consider $r = \Gamma_d = \lfloor\theta_d\rceil$. Then

$$E[\exp(-d\,g_v(r)^2/2)] \ge \frac{\nu_4(c)}{\sqrt{d}}.$$

Proof. Define $\xi_{r,d} = (r - \theta_d)\ln\tau$. Since $\theta_d = \ln(d/\lambda)/\ln\tau$,

$$g_v(r) = r\ln\tau - \ln(d/\lambda_v) = r\ln\tau - \ln(d/\lambda) + \ln\rho_v = \xi_{r,d} + \ln\rho_v.$$

Hence,

$$E[\exp(-d\,g_v(r)^2/2)] = E[\exp[-d(\ln\rho_v + \xi_{r,d})^2/2]].$$

Since we set $r = \lfloor\theta_d\rceil$, $|\xi_{r,d}| \le (\ln\tau)/2$. Let us now evaluate the expectation. The pdf of $\ln\rho_v$ is denoted by $f_v$. The expectation is given by an integral. To distinguish the $d$ referring to degree from the $d$ referring to the infinitesimal, we shall write $(dx)$ in parentheses for the infinitesimal. We hope this slight abuse of notation will not create a problem, since our integrals are not too confusing. By Lemma 4.2, $f_v(x) \ge \nu_1(c)$ for $|x| \le \hat{\tau}$.

$$E[\exp(-d\,g_v(r)^2/2)] = \int_{-\infty}^{+\infty} \exp[-d(x + \xi_{r,d})^2/2]\, f_v(x)\,(dx) \ge \nu_1(c) \int_{-\hat{\tau}}^{\hat{\tau}} \exp[-d(x + \xi_{r,d})^2/2]\,(dx) = \nu_1(c) \int_{-\hat{\tau} + \xi_{r,d}}^{\hat{\tau} + \xi_{r,d}} \exp[-dx^2/2]\,(dx).$$

We have $|\xi_{r,d}| \le (\ln\tau)/2$ and $\hat{\tau} = \max(2, \ln\tau)$. Hence, $\hat{\tau} + \xi_{r,d} \ge 1$ and $-\hat{\tau} + \xi_{r,d} \le -1$.

$$E[\exp(-d\,g_v(r)^2/2)] \ge \nu_1(c) \left[ \int_{-\infty}^{+\infty} \exp[-dx^2/2]\,(dx) - 2\int_{1}^{+\infty} \exp[-dx^2/2]\,(dx) \right]$$
$$= \frac{\nu_1(c)}{\sqrt{d}} \left[ \int_{-\infty}^{+\infty} \exp[-x^2/2]\,(dx) - 2\int_{\sqrt{d}}^{+\infty} \exp[-x^2/2]\,(dx) \right].$$

The first integral is just $\sqrt{2\pi}$. The second is a tail probability of the standard Gaussian, bounded by $\int_y^{+\infty} e^{-x^2/2}\,dx \le e^{-y^2/2}/y$ (Lemma 2, pg. 175 of [Feller 1968]). The second term is at most $2e^{-d/2}/\sqrt{d} < \sqrt{\pi}$ (for sufficiently large $d$). Therefore, we can set the function $\nu_4(c)$ such that $E[\exp(-d\,g_v(r)^2/2)] \ge \nu_4(c)/\sqrt{d}$. □
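
The Gaussian-tail step at the end of this proof can be checked numerically. The sketch below is illustrative only (the helper names `central_mass` and `lower_bound` are ours): it compares the exact value of $\int_{-1}^{1} e^{-dx^2/2}\,dx$, written via the error function, with the lower bound $(\sqrt{2\pi} - 2e^{-d/2}/\sqrt{d})/\sqrt{d}$ implied by the tail estimate.

```python
import math

def central_mass(d: float) -> float:
    """Exact value of the integral of exp(-d x^2 / 2) over [-1, 1]."""
    return math.sqrt(2 * math.pi / d) * math.erf(math.sqrt(d / 2))

def lower_bound(d: float) -> float:
    """(sqrt(2 pi) - 2 exp(-d/2)/sqrt(d)) / sqrt(d), from the Gaussian tail bound."""
    tail = 2 * math.exp(-d / 2) / math.sqrt(d)
    return (math.sqrt(2 * math.pi) - tail) / math.sqrt(d)

for d in (1, 4, 16, 64):
    print(d, central_mass(d), lower_bound(d))
```

Both quantities scale like $\sqrt{2\pi/d}$ for large $d$, which is where the $1/\sqrt{d}$ factor in Lemma 4.13 comes from.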

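Proof of Theorem 4.1. This is a direct consequence of the previous claims. Set $r = \Gamma_d$. By Claim 4.12 and linearity of expectation, $E[X_d] = E[E[X_d \mid \mathcal{T}]] \ge ((1 - o(1))/\sqrt{2\pi d}) \sum_{v \in S_r} E[\exp(-d\,g_v(r)^2/2)]$. Lemma 4.13 tells us that $E[\exp(-d\,g_v(r)^2/2)] \ge \nu_4(c)/\sqrt{d}$. Since $|S_r| = \binom{\ell}{\ell/2 + \Gamma_d}$, we conclude that $E[X_d] \ge (\nu(c)/d)\binom{\ell}{\ell/2 + \Gamma_d}$ for a suitable positive function $\nu$. □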



4.6. Subtleties in adding noise 

One might ask why we add noise in this particular fashion, and whether other ways of adding noise are equally effective. Since we only need $\ell$ random numbers, it seems intuitive that adding "more noise" could only help. For example, we might add noise on a per edge basis, i.e., at each level $i$ of every edge insertion, we choose a new random perturbation $T_i$ of $T$. Interestingly, this version of noise does not smooth out the degree distribution, as shown in Figure 6. In this figure, the red curve corresponds to the degree distribution of the graph generated by NSKG with Graph500 parameters, $\ell = 26$, and $b = 0.1$. The blue curve corresponds to generation by adding noise per edge. As seen in this figure, adding noise per edge has hardly any effect on the oscillations, while NSKG provides a smooth degree distribution curve. (These results are fairly consistent over different parameter choices.) It is crucial that we use the same noisy $T_1, \ldots, T_\ell$ for every edge insertion.
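
One way to build intuition for this observation (our reading, offered as an illustration rather than an argument taken from the text) is that the noisy matrices average back to $T$: if fresh noise is drawn for every edge and level, each quadrant choice is made with the averaged probabilities, so the perturbation washes out. The sketch below estimates that average empirically; `noisy_T` and `mean_noisy_T` are hypothetical helper names, and the matrix values are the commonly quoted Graph500 initiator.

```python
import random

def noisy_T(T, b, rng):
    """One noisy perturbation of a symmetric T = [[t1, t2], [t3, t4]]."""
    (t1, t2), (t3, t4) = T
    mu = rng.uniform(-b, b)
    s = 2 * mu / (t1 + t4)
    return [[t1 - s * t1, t2 + mu], [t3 + mu, t4 - s * t4]]

def mean_noisy_T(T, b, samples, seed=0):
    """Entrywise average of `samples` independent noisy matrices."""
    rng = random.Random(seed)
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(samples):
        Ti = noisy_T(T, b, rng)
        for i in range(2):
            for j in range(2):
                acc[i][j] += Ti[i][j] / samples
    return acc

T = [[0.57, 0.19], [0.19, 0.05]]
print(mean_noisy_T(T, b=0.1, samples=20000))
```

With NSKG, by contrast, the same $\ell$ perturbed matrices are shared by all $m$ edge insertions, so the perturbation of each vertex's bias persists instead of averaging out.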
Fig. 6: Comparison of degree distribution of graphs generated by NSKG and by adding noise per edge for Graph500 parameters and $\ell = 26$. (Legend: noise per edge, $b = 0.1$; NSKG, $b = 0.1$.)



5. EXPECTED NUMBER OF ISOLATED VERTICES 

In this section, we give a simple formula for the number of isolated vertices in SKG. We focus on the symmetric case, where $t_2 = t_3$ in the matrix $T$. We assume that $\ell$ is even in the following, but the formula can be extended to odd $\ell$. The real contribution here is a clearer understanding of how many vertices SKG leaves isolated and how the SKG parameters affect this number.

Theorem 5.1. Consider SKG with $T$ symmetric and let $I$ denote the number of isolated vertices. With probability $1 - o(1)$,

$$I = (1 \pm o(1)) \sum_{r=-\ell/2}^{\ell/2} \binom{\ell}{\ell/2 + r} \exp(-2\lambda\tau^r). \qquad (3)$$

Claim 5.2. Let $q_r$ be the probability that a single edge insertion produces an in-edge or out-edge incident to $v \in S_r$. Then, for SKG with $T$ symmetric,

$$q_r = (1 \pm o(1)) \frac{2(1 - 4\sigma^2)^{\ell/2}\tau^r}{n}.$$

Proof. Let $E_0$ (resp. $E_1$) be the event that a single edge insertion is an in-edge (resp. out-edge) of $v$. We have $q_r = \Pr(E_0) + \Pr(E_1) - \Pr(E_0 \cap E_1)$. By Claim 3.3 and the symmetry of $T$, the first two probabilities are $(1 - 4\sigma^2)^{\ell/2}\tau^r/n$. The last is the probability that the edge insertion leads to a self-loop at $v$. This is at most $\Pr(E_0)$. Since $\sigma < 1$, this is $o(\Pr(E_0))$. □

As before, we can assume that $q_r \le 1/\sqrt{m}$. By Claim 3.4, if $q_r \ge 1/\sqrt{m}$ then with probability tending to 1, vertices in slice $r$ are not isolated. Hence, we can ignore such vertices when computing estimates for $I$.

Claim 5.3. Let $v \in S_r$ and assume $q_r \le 1/\sqrt{m}$. Then, for SKG with $T$ symmetric,

$$\Pr[v \text{ is isolated}] = (1 \pm o(1)) \exp(-2\lambda\tau^r).$$

Proof. Using Claim 5.2 and $(1 - x)^m = (1 \pm o(1))e^{-xm}$, for $|x| \le 1/\sqrt{m}$,

$$(1 - q_r)^m = (1 \pm o(1)) \exp(-2(1 \pm o(1))\Delta(1 - 4\sigma^2)^{\ell/2}\tau^r) = (1 \pm o(1)) \exp(-2(1 \pm o(1))\lambda\tau^r).$$

For large $\ell$, this converges to $\exp(-2\lambda\tau^r)$. □
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Proof of Theorem 5.1. By Claim 5.3 and linearity of expectation, the expected num- 
ber of isolated vertices is 

r—£ /2 

r=-e/2 \ / ^ / 

To bound the actual number of isolated vertices, we use concentration inequalities for
functions of independent random variables. Let $Y$ denote the number of isolated vertices,
and $X_1, X_2, \ldots, X_m$ be the labels of the $m$ edge insertions. Note that all the $X_i$'s are
independent, and $Y$ is some fixed function of $X_1, X_2, \ldots, X_m$. Suppose we fix all the edge
insertions and just modify one insertion. Then, the number of isolated vertices can change
by at most $c = 2$. Hence, the function defining $Y$ satisfies a Lipschitz condition. This means
that changing a single argument of $Y$ (some $X_i$) modifies the value of $Y$ by at most a
constant ($c$). By McDiarmid's inequality [McDiarmid 1989],

\[
\Pr\bigl[|Y - \mathrm{E}[Y]| > \epsilon\bigr] < 2 \exp\left(-\frac{2\epsilon^2}{c^2 m}\right).
\]

Setting $\epsilon = \sqrt{m \log m}$, we get that the probability that $Y$ deviates from its
expectation by more than $\sqrt{m \log m}$ is $o(1)$. The expected number of isolated vertices is
at least $\binom{\ell}{\ell/2} \exp(-2\lambda)$, and $\sqrt{m \log m}$ is a lower-order term with
respect to this quantity. This completes the proof. □
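
As a worked instance of the concentration step, the following sketch plugs $c = 2$ and
$\epsilon = \sqrt{m \log m}$ into McDiarmid's inequality. The numeric values are illustrative
assumptions rather than part of the proof: they take the Graph500 setting with $\ell = 26$ and
average degree 16 (so $m = 2^{30}$) and read $\log$ as the natural logarithm.

```latex
\begin{align*}
\Pr\bigl[\,|Y - \mathrm{E}[Y]| > \epsilon\,\bigr]
  &< 2\exp\!\left(-\frac{2\epsilon^2}{c^2 m}\right)
   = 2\exp\!\left(-\frac{2\, m \log m}{4 m}\right)
   = 2\exp\!\left(-\tfrac{1}{2}\log m\right)
   = \frac{2}{\sqrt{m}}, \\
m = 2^{30}: \quad
  \frac{2}{\sqrt{m}} &= 2^{-14} \approx 6.1 \times 10^{-5}, \qquad
  \epsilon = \sqrt{2^{30} \cdot 30 \ln 2} \approx 1.5 \times 10^{5}.
\end{align*}
```

For comparison, $n = 2^{26} \approx 6.7 \times 10^{7}$ in this setting, so a deviation of this size
is a small fraction of the vertex count.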

The fraction of isolated vertices in a slice $r$ is essentially $\exp(-2\lambda\tau^r)$. Note that
$\tau$ is larger than 1. Hence, this is a decreasing function of $r$. This is quite natural, since
if a vertex $v$ has many zeros in its representation (higher slice), then it is likely to have a
larger degree (and less likely to be isolated). This function is doubly exponential in $r$, and
therefore decreases quickly with $r$. The fraction of isolates rapidly goes to 0 (resp. 1) as $r$
becomes positive (resp. negative).
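
The per-slice behavior can be tabulated directly from Claim 5.3. The sketch below is only
illustrative. It assumes the definitions $\sigma = t_1 + t_2 - 1/2$, $\tau = (1 + 2\sigma)/(1 - 2\sigma)$,
and $\lambda = \Delta (1 - 4\sigma^2)^{\ell/2}$ from the earlier sections, drops the $1 \pm o(1)$
factor, and uses the Graph500 values $t_1 = 0.57$, $t_2 = 0.19$, $\Delta = 16$. The function name is
hypothetical.

```python
from math import exp


def slice_isolation_probability(t1, t2, ell, avg_degree, r):
    """Approximate Pr[v is isolated] for a vertex v in slice r (Claim 5.3).

    Assumes a symmetric 2x2 initiator whose first row is (t1, t2) and the
    definitions of sigma, tau and lambda stated in the lead-in. The
    1 +/- o(1) factor of the claim is ignored.
    """
    sigma = t1 + t2 - 0.5
    tau = (1 + 2 * sigma) / (1 - 2 * sigma)
    lam = avg_degree * (1 - 4 * sigma ** 2) ** (ell / 2)
    return exp(-2 * lam * tau ** r)


if __name__ == "__main__":
    # Graph500 initiator values and edge factor (assumed here), ell = 26.
    for r in range(-4, 5):
        p = slice_isolation_probability(0.57, 0.19, 26, 16, r)
        print(f"r = {r:+d}: Pr[isolated] ~ {p:.4f}")
```

Under these assumptions the probability is about 0.59 at $r = 0$ and collapses within a few
slices on the positive side, which is the doubly exponential decay described above.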

5.1. Effect of noise on isolated vertices 

The introduction of noise was quite successful in correcting the degree distribution but has
little effect on the number of isolated vertices. This is not surprising, since noise affects
the fat-tail behavior of the degree distribution, and the number of isolated vertices is a
different aspect of that distribution. The data presented in Table V clearly show that
the number of isolated vertices is quite resistant to noise. While there is some decrease in
the number of isolated vertices, the decrease is very small compared to the total number
of isolated vertices. We have observed similar results for the other parameter settings.



Table V: Percentage of isolated vertices with different noise levels for the Graph500
parameters and $\ell = 26$

    Max. noise level ($b$)    % isolated vertices
    0                         51.12
    0.05                      49.26
    0.06                      49.12
    0.07                      49.06
    0.08                      49.07
    0.09                      49.16
    0.1                       49.34

In addition to this empirical study, we can also give some mathematical intuition behind
these observations. The equivalent statement of Claim 5.3 for NSKG is

\[
\Pr[v \text{ is isolated}] \ge (1 - o(1)) \exp(-2\lambda\tau^r \rho_v)
                            = (1 - o(1)) \bigl[\exp(-2\lambda\tau^r)\bigr]^{\rho_v}.
\]

The noiseless version of this probability is $\exp(-2\lambda\tau^r)$. Note that the probability now
is a random variable that depends on $T$, since $\rho_v$ depends on the noise. Lemma 4.13 tells us
that $\rho_v$ lies mostly in the range $[1 - c', 1 + c']$ (for constant $c'$), and is concentrated
close to 1.

We are mainly interested in the case when the probability that $v$ is isolated is not
vanishingly small (is at least, say, 0.01). As $\ell$ grows, $\rho_v$ is close to 1, and deviations
are quite small. So, when we take the noiseless probability to the $\rho_v$th power, we get almost
the same value.
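
To make this last step concrete, the following bound (an added illustration, not part of
Lemma 4.13) quantifies how little a probability $p \in (0, 1)$ changes when it is raised to a
power $\rho$ near 1. It uses only the mean value theorem and the elementary fact that
$x\,|\ln x| \le 1/e$ on $(0, 1]$. The value $\rho = 0.95$ is an arbitrary example.

```latex
\begin{align*}
\frac{d}{d\rho}\, p^{\rho} &= p^{\rho} \ln p
  = \frac{1}{\rho}\, p^{\rho} \ln\bigl(p^{\rho}\bigr),
  \qquad\text{so}\qquad
  \left|\frac{d}{d\rho}\, p^{\rho}\right| \le \frac{1}{e\,\rho}, \\
\bigl|p^{\rho} - p\bigr| &\le \frac{|\rho - 1|}{e \cdot \min(1, \rho)}, \\
\rho = 0.95: \quad \bigl|p^{0.95} - p\bigr| &\le \frac{0.05}{0.95\, e} \approx 0.019
  \quad\text{for every } p \in (0, 1).
\end{align*}
```

A 5% perturbation of the exponent therefore moves the isolation probability by less than 0.02
in absolute terms, whatever the noiseless value is.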

5.2. Relation of SKG parameters to the number of isolated vertices

When $\lambda$ decreases, the number of isolated vertices increases. Suppose we fix the SKG matrix
and average degree $\Delta$, and start increasing $\ell$. Note that this is done in the Graph500
benchmark, to construct larger and larger graphs. The value of $\lambda$ decreases exponentially in
$\ell$, so the number of isolated vertices will increase. Our formula suggests ways of counteracting
this problem. The value of $\Delta$ could be increased, or the value of $\sigma$ could be decreased.
But, in general, this will be a problem for generating large sparse graphs using a fixed SKG matrix.

When $\sigma$ increases, $\lambda$ decreases and $\tau$ increases. Nonetheless, the effect of $\lambda$
is much stronger than that of $\tau$. Hence, the number of isolated vertices will increase as
$\sigma$ increases. In Table II, we compute the estimated number of isolated vertices in graphs for
the Graph500 parameters. Observe how the fraction of isolated vertices consistently increases as
$\ell$ is increased. For the largest setting of $\ell = 42$, only one fourth of the vertices are
not isolated.
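
The trend in $\ell$ can be reproduced by evaluating the leading term of Theorem 5.1 directly. The
sketch below makes several assumptions that are not stated in this section: the definitions
$\sigma = t_1 + t_2 - 1/2$, $\tau = (1 + 2\sigma)/(1 - 2\sigma)$, $\lambda = \Delta (1 - 4\sigma^2)^{\ell/2}$,
the Graph500 values $t_1 = 0.57$, $t_2 = 0.19$, $\Delta = 16$, and even $\ell$. The $1 \pm o(1)$
factor is dropped, and the function name is hypothetical.

```python
from math import comb, exp


def isolated_fraction(t1, t2, ell, avg_degree):
    """Leading-term estimate of I / n from Theorem 5.1 (even ell only).

    Assumes a symmetric 2x2 initiator whose first row is (t1, t2).
    """
    sigma = t1 + t2 - 0.5
    tau = (1 + 2 * sigma) / (1 - 2 * sigma)
    lam = avg_degree * (1 - 4 * sigma ** 2) ** (ell / 2)
    half = ell // 2
    total = 0.0
    for r in range(-half, half + 1):
        # comb(ell, half + r) vertices lie in slice r; each is isolated
        # with probability roughly exp(-2 * lambda * tau^r).
        total += comb(ell, half + r) * exp(-2 * lam * tau ** r)
    return total / 2 ** ell


if __name__ == "__main__":
    for ell in (26, 32, 36, 42):
        frac = isolated_fraction(0.57, 0.19, ell, 16)
        print(f"ell = {ell}: estimated isolated fraction = {frac:.3f}")
```

With these inputs the estimate is close to one half at $\ell = 26$ and close to three quarters at
$\ell = 42$, consistent with the figures quoted in the text. Raising `avg_degree` or lowering
`t1 + t2` in the call shows the two counteracting options mentioned above.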

6. k-CORES IN SKG

Structures of $k$-cores are an important part of social network analysis [Carmi et al. 2007;
Alvarez-Hamelin et al. 2008; Kumar et al. 2010], as they are a manifestation of the
community structure and high connectivity of these graphs.

Definition 6.1. Given an undirected graph $G = (V, E)$, the subgraph induced by a set
$S \subseteq V$ is denoted by $G|_S := (S, E')$, where $E'$ contains every edge of $E$ that is
completely contained in $S$. For an undirected graph, the $k$-core of $G$ is the largest induced
subgraph of minimum degree $k$. The max core number of $G$ is the largest $k$ such that $G$
contains a (non-empty) $k$-core. (These can be extended to directed versions: a $k$-out-core is a
subgraph with min out-degree $k$.)
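
Definition 6.1 can be turned into a short computation by repeatedly peeling a vertex of minimum
remaining degree. The sketch below is one standard way to do this for an undirected graph given
as an edge list. It is not taken from the paper, the function names are hypothetical, and vertex
labels are assumed to be mutually comparable (for example, integers).

```python
import heapq
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected simple graph.

    Self-loops and repeated edges are ignored. Vertices are peeled in order
    of current degree; a vertex's core number is the largest minimum degree
    seen up to the moment it is removed.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)
    core = {}
    k = 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in core or d != degree[v]:
            continue  # stale heap entry left over from an earlier update
        k = max(k, d)
        core[v] = k
        for w in adj[v]:
            if w not in core:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
    return core


def max_core_number(edges):
    """Largest k such that the graph contains a non-empty k-core."""
    return max(core_numbers(edges).values(), default=0)


if __name__ == "__main__":
    # A 4-clique on {0, 1, 2, 3} with a pendant vertex 4 attached to 3.
    k4_plus_tail = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
    print(core_numbers(k4_plus_tail))
    print(max_core_number(k4_plus_tail))
```

For a directed graph, passing its edges unchanged treats them as undirected, which corresponds
to removing the directions before computing cores.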

A bipartite core is an induced subgraph in which every vertex has either a high in-degree or
a high out-degree. The former are called authorities and the latter are hubs. Large bipartite
cores are present in web graphs and are an important structural component [Gibson et al. 1998;
Kleinberg 1999]. Note that if we make the directed graph undirected (by simply removing
the directions), then a bipartite core becomes a normal core. Hence, it is useful to compute
cores in a directed graph by making it undirected.

We begin by comparing the sizes of $k$-cores in real graphs and their models using SKG
[Leskovec et al. 2010]. Refer to Figure 7. We plot the size of the maximum $k$-core against $k$.
The $k$ at which the curve ends is the max core number. (For CAHepPh, we look at undirected
cores, since this is an undirected graph. For WEBNotreDame, a directed graph, we look
at out-cores. But the empirical observations we make hold for all other core versions.)
For both our examples, we see how drastically different the curves are. By far the most
important difference is that the curves for the SKG versions are extremely short. This means
that the max core number is much smaller for SKG modeled graphs compared to their
real counterparts. For the web graph WEBNotreDame, we see the presence of large cores,
probably an indication of some community structure. The maximum core number of the
SKG version is an order of magnitude smaller. Minor modifications (like increasing degree,
or slight variation of parameters) to these graphs do not increase the core sizes or max core
numbers much. This is a problem, since it strongly suggests that SKG graphs do not exhibit
localized density like real web graphs or social networks.

Fig. 7: Core decompositions of real graphs and their SKG model. Observe that the max
core of SKG is an order of magnitude smaller.

If we wish to use SKG to model real networks, then it is imperative to understand the
behavior of max core numbers for SKG. Indeed, in Table VI, we see that our observation is not
just an artifact of our examples. SKG graphs consistently have very low max core numbers. Only
for the peer-to-peer Gnutella graphs does SKG match the real data, and this is specifically for
the case where the max core number is extremely small. For the undirected graphs (the first
three co-authorship networks), we have computed the undirected cores. The corresponding
SKG is generated by copying the upper triangular part in the lower half to get a symmetric
matrix (an undirected graph). The remaining graphs are directed, and we simply remove
the direction on the edges and compute the total core. Our observations hold for in- and
out-cores as well, and for a wide range of data. This is an indication that SKG is not generating
sufficiently dense subgraphs.



Table VI: Core sizes in real graphs and SKG version

    Graph            Real max core    SKG max core
    CAGrQc           43               4
    CAHepPh          238              16
    CAHepTh          31               5
    CITHepPh         30               19
    CITHepTh         37               19
    P2PGnutella25    5                5
    P2PGnutella30    7                6
    SOCEpinions      67               43
    WEBNotreDame     155              31

We focus our attention on the max core number of SKG. How does this number change
with the various parameters? The following summarizes our observations.

Empirical Observation 6.2. For SKG with symmetric $T$, we have the following
observations.

(1) The max core number increases with $\sigma$. By and large, if $\sigma < 0.1$, max core numbers
are extremely tiny.

(2) Max core numbers grow with $\ell$ only when the values of $\sigma$ are sufficiently large. Even
then, the growth is much slower than the size of the graph. For smaller $\sigma$, max core numbers
exhibit essentially negligible growth.

(3) Max core numbers increase essentially linearly with $\Delta$.

Large max core numbers require larger values of $\sigma$. As mentioned in §5, increasing $\sigma$
increases the number of isolated vertices. Hence, there is an inherent tension between
increasing the max core number and decreasing the number of isolated vertices.

For the sake of consistency, we performed the following experiments on the max core
after taking a symmetric version of the SKG graph. Our results look the same for in- and
out-cores as well. In Figure 8a, we show how increasing $\sigma$ increases the max core number.
We fix the values of $\ell = 16$ and $m = 6 \times 2^{16}$. (There is nothing special about these
values. Indeed the results are basically identical, regardless of this choice.) Then, we fix $t_1$
(or $t_2$) to some value, and slowly increase $\sigma$ by increasing $t_2$ (resp. $t_1$). We see that
regardless of the fixed values of $t_1$ (or $t_2$), the max core consistently increases. But as long
as $\sigma < 0.1$, max core numbers remain almost the same.

In Figure 8b, we fix matrix $T$ and average degree $\Delta$, and only vary $\ell$. For
WEBNotreDame†, we have $\sigma = 0.18$ and for CAHepPh, we have $\sigma = 0.11$. For both cases,
increasing $\ell$ barely increases the max core number. Despite increasing the graph size by 8
orders of magnitude, the max core number only doubles. Contrast this with the Graph500
setting, where $\sigma = 0.26$, and we see a steady increase with larger $\ell$. This is a predictable
pattern we notice for many different parameter settings: larger $\sigma$ leads to larger max core
numbers as $\ell$ goes up. Finally, in Figure 8c, we see that the max core number is basically
linear in $\Delta$.
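
Experiments of this kind are easy to repeat at small scale. The following sketch is a toy setup,
not the code used for Figure 8. It assumes the usual SKG edge-insertion procedure (one quadrant
of the $2 \times 2$ initiator chosen per level), symmetrizes the result by ignoring edge directions,
and computes the max core number by min-degree peeling. All function names, the seed, and the
choice $\ell = 10$ are hypothetical.

```python
import heapq
import random
from collections import defaultdict


def skg_edge(t1, t2, t3, ell, rng):
    """Draw one (source, target) pair by descending ell levels of the
    initiator [[t1, t2], [t3, t4]], with t4 = 1 - t1 - t2 - t3."""
    u = v = 0
    for _ in range(ell):
        x = rng.random()
        if x < t1:
            row, col = 0, 0
        elif x < t1 + t2:
            row, col = 0, 1
        elif x < t1 + t2 + t3:
            row, col = 1, 0
        else:
            row, col = 1, 1
        u, v = 2 * u + row, 2 * v + col
    return u, v


def max_core_number(edges):
    """Max core number of the undirected simple graph on the given edges."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)
    removed = set()
    k = 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != degree[v]:
            continue  # stale heap entry
        k = max(k, d)
        removed.add(v)
        for w in adj[v]:
            if w not in removed:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
    return k


if __name__ == "__main__":
    ell = 10
    rng = random.Random(0)
    # Graph500 initiator values (assumed); 16 * 2^ell insertions in total.
    edges = [skg_edge(0.57, 0.19, 0.19, ell, rng) for _ in range(16 * 2 ** ell)]
    for avg_degree in (4, 8, 16):
        prefix = edges[: avg_degree * 2 ** ell]
        print(f"Delta = {avg_degree}: max core = {max_core_number(prefix)}")
```

Using prefixes of one insertion sequence means each larger graph contains the smaller one, so the
printed max core numbers cannot decrease as $\Delta$ grows. Whether the growth looks linear at
this small scale is something to check empirically, not something the sketch guarantees.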

6.1. Effect of noise on cores 

Our general intuition is that NSKG mainly redistributes edges of SKG to get a smooth
degree distribution, but does not have major effects on the overall structure of the graph.
This is somewhat validated by our studies on isolated vertices and reinforced by looking at
$k$-cores. In Figure 9, we plot the core decompositions of SKG and two versions of NSKG
($b = 0.05$ and $b = 0.1$). We observe that there is little change in these decompositions,
although there is a smoothing of the curve for the Graph500 parameters. The problem of tiny
cores of SKG is not mitigated by the addition of noise.

7. CONCLUSIONS 

For a true understanding of a model, a careful theoretical and empirical study of its prop- 
erties in relation to its parameters is imperative. This not only provides insight into why 
certain properties arise, but also suggests ways for enhancement. One strength of the SKG 
model is its amenability to rigorous analysis, which we exploit in this paper. 

We prove strong theorems about the degree distribution, and more significantly show
how adding noise can give a true lognormal distribution by eliminating the oscillations in
degree distributions. Our proposed method of adding noise requires only $\ell$ random numbers
all together, and is hence cost effective. We want to stress that our major contribution is
in providing both the theory and matching empirical evidence. The formula for the expected
number of isolated vertices provides an efficient alternative to methods for computing the
full degree distribution. Besides requiring fewer operations to compute and being less prone
to numerical errors, the formula transparently relates the expected number of isolated
vertices to the SKG parameters. Our studies on core numbers establish a connection between
the model parameters and the cores of the resulting graphs. In particular, we show that
commonly used SKG parameters generate tiny cores, and the model's ability to generate
large cores is limited.

† Even though the matrix $T$ is not symmetric, we can still define $\sigma$. Also, the off-diagonal
values are 0.20 and 0.21, so they are almost equal.

Fig. 8: We plot the max core number against various parameters. In the first picture, we
plot the max core number of a (symmetric) SKG graph with increasing $\sigma$. Next, we show
how the max core number increases with the number of levels. Observe the major role
that $\sigma$ plays. For Graph500, $\sigma$ is much larger than for the other parameter sets.
Finally, we show that regardless of the parameters, the max core number increases linearly
with $\Delta$. [Recoverable panel titles: (b) Varying $\ell$; (c) Varying $\Delta$. Legend: CAHepPh,
Graph500, WEBNotreDame.]

Fig. 9: We plot the core decomposition of SKG and NSKG (with 2 settings of noise) for the
different parameters. Observe that there is only a minor change in core sizes with noise.
[Recoverable panel title: (c) CAHepPh. Legend: noise = 0.05, noise = 0.1. Horizontal axis: $k$.]